vdiff

package
v0.21.1
Published: Dec 3, 2024 License: Apache-2.0 Imports: 43 Imported by: 0

README

Work In Progress document that describes some key components of vdiff2.

Most of the vdiff2 code is a reorganization of vdiff, with a few changes driven by key functional differences between the two. See the RFC at https://github.com/vitessio/vitess/issues/10134 for more info.

Key Differences from VDiff1

  • VDiff2 runs on tablets and not on vtctld. In VDiff1, a single vtctld process runs synchronously, getting data from all target and source shards and comparing them. In VDiff2, each target shard compares only the data belonging to that target and gets data from the related sources.

  • VDiff2 is designed to be resumable. VDiff1 had to be reinvoked in case of any failure. Resumability requires us to switch from using the ResultStreamer to the RowStreamer. The RowStreamer can start from a provided LastPK and also supplies the LastPK in each batch, so a diff can pick up where it left off.

  • VDiff2 runs asynchronously. The vtctld VDiff command will create the vdiff streams, one on each shard. The streams will be run on the current primary and record progress. The vtctld command will also support actions which can manage vdiff's state and report progress.
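
The resumability point above can be illustrated with a minimal, self-contained sketch. It does not use the real RowStreamer API: the `row` type, `streamBatch`, `processFrom`, and the use of integer primary keys are all hypothetical stand-ins. The idea shown is only that each batch reports its last PK, and that saving this value lets a later run continue from the same position instead of starting over.

```go
package main

import "fmt"

// row is a hypothetical stand-in for a streamed table row with an integer PK.
type row struct {
	PK    int
	Value string
}

// streamBatch returns up to batchSize rows with PK > lastPK from a PK-sorted
// table, plus the PK of the last row returned. It imitates a streamer that can
// start from a provided LastPK; it is not the Vitess RowStreamer.
func streamBatch(table []row, lastPK, batchSize int) (batch []row, newLastPK int) {
	newLastPK = lastPK
	for _, r := range table {
		if r.PK <= lastPK {
			continue
		}
		if len(batch) == batchSize {
			break
		}
		batch = append(batch, r)
		newLastPK = r.PK
	}
	return batch, newLastPK
}

// processFrom consumes at most maxBatches batches starting after lastPK.
// A small maxBatches simulates an interruption. It returns the number of rows
// processed and the last PK that a real implementation would persist.
func processFrom(table []row, lastPK, batchSize, maxBatches int) (processed, savedPK int) {
	savedPK = lastPK
	for i := 0; i < maxBatches; i++ {
		batch, pk := streamBatch(table, savedPK, batchSize)
		if len(batch) == 0 {
			break
		}
		processed += len(batch)
		savedPK = pk // saving this value is what makes a later resume possible
	}
	return processed, savedPK
}

func main() {
	table := []row{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}, {5, "e"}}

	n1, pk := processFrom(table, 0, 2, 1)    // interrupted after one batch
	n2, pk2 := processFrom(table, pk, 2, 10) // resumed from the saved PK
	fmt.Println(n1, pk, n2, pk2)
}
```

In this sketch the second call never revisits rows 1 and 2, which is the property that a resumable diff relies on.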

What stays the same

The rest of the code is mostly a cut/paste with minor refactoring. Instead of the entire code residing in a single vdiff.go file, we split it into modules in the tabletmanager/vdiff directory.

  • The data comparison algorithm is identical.

  • The mechanism of getting the workflow streams in sync and generating the snapshots is the same, except that only one target is involved per-shard

Performance Impact

Pros

  • Since it is running on multiple tablets in parallel, VDiff2 should be much faster on a sharded keyspace.
  • Being resumable, VDiff2 will be more resilient.
    • If VDiff2 is run in resumable mode, then we can take smaller snapshots than the single long-running one that VDiff needs.
  • Special provisioning of vtctld servers is not needed.
  • Cross-cluster network traffic will be reduced since data being compared by targets will be local.
  • VDiff2 will eliminate the remaining use case for the vtctl (vtctlclient+vtctld) binary -- vdiffs on large tables that could take more than a day to complete.
Cons
  • In VDiff1, each source had only one vstreamer running for each vdiff. Now every target to which the source contributes will have one vstreamer each, so the additional load depends on the sharding configurations of source and target.
  • The per-shard comparison happens on the target primary. So if the target is production-serving, there will be extra CPU utilization, potentially requiring larger provisioning on the target.

Setup

The user invokes vtctlclient VDiff on an existing workflow by specifying the target keyspace and workflow and overriding any of the default options that are described in the documentation of the vdiff2 command on the vitessio website.

Database Schema

A new set of tables has been added to the _vt database to support vdiff2.

vdiff

Contains one row per workflow for which vdiff has been requested. Currently only one vdiff can run at a time for a given workflow, so if it has to be restarted its state will be reset. It maintains the state of the vdiff and options specified.

vdiff_table

One row for each table involved in a vdiff. Each row maintains the most recent state of its table: status, last PK, number of rows compared, etc. Any mismatch is flagged immediately and the report of the diff is updated when the diff ends or stops for any reason.

vdiff_log

Free format log of important events in a vdiff such as state changes and errors.

Modules

This section contains a short note on the key modules in this directory.

Engine (engine.go)

This is a singleton in tabletserver similar to other engines. Each active vdiff has its own controller which is managed by the engine. On a primary, when the engine is opened, vdiff is started/resumed on all active controllers.

It also has the related table schema.

Controller (controller.go)

There is one controller for each active vdiff. It maintains the configuration required to run a VDiff, including related sources, tablet picker options, and an instance of the TabletManagerClient to make RPC calls to other targets. It sets up a WorkflowDiffer which performs the actual diff.

Workflow Differ (workflow_differ.go)

Sets up all the tables that need to be diffed using the TableDiffer and serially invokes each diff.

Table Differ (table_differ.go)

This is the main module that runs a diff on each table, keeps intermediate state and periodically updates this in the _vt tables.

Shard Streamer (shard_streamer.go)

Primitive Executor (primitive_executor.go)

Merge Sorter (utils.go)

Miscellaneous Notes

Since we stream using the standard vstreamer API, we automatically subscribe to the lag throttler mechanism, which will delay the streaming if the replication lag on the specific shard is above the threshold.

VDiff2 can recover from the following issues:

  • Source or target tablets, which are streaming the data, failing/restarting
  • A PRS on the target tablet on which VDiff is running
  • User stopping/restarting the operation
  • Network failures

Documentation

Overview

Package vdiff implements the Vitess Diff functionality, which compares the logical contents of MySQL tables across keyspaces and shards.

Please see ./README.md for more information.

Index

Constants

This section is empty.

Variables

var BackgroundOperationTimeout = topo.RemoteOperationTimeout * 4

how long to wait for background operations to complete

var ErrMaxDiffDurationExceeded = vterrors.Errorf(vtrpcpb.Code_DEADLINE_EXCEEDED, "table diff was stopped due to exceeding the max-diff-duration time")
var ErrVDiffStoppedByUser = vterrors.Errorf(vtrpcpb.Code_CANCELED, "vdiff was stopped by user")

Functions

This section is empty.

Types

type DiffMismatch

type DiffMismatch struct {
	Source *RowDiff `json:"Source,omitempty"`
	Target *RowDiff `json:"Target,omitempty"`
}

DiffMismatch is a sample of row diffs between source and target.
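
To see how the struct tags shape the serialized form, here is a small sketch using local copies of DiffMismatch and RowDiff (copied from the declarations on this page so the example runs without importing Vitess). The `toJSON` helper and the sample row are illustrative assumptions. Because both fields are pointers tagged `omitempty`, a side that is nil is simply left out of the output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RowDiff and DiffMismatch are local copies of the types documented on this
// page, reproduced here only to keep the example self-contained.
type RowDiff struct {
	Row   map[string]string `json:"Row,omitempty"`
	Query string            `json:"Query,omitempty"`
}

type DiffMismatch struct {
	Source *RowDiff `json:"Source,omitempty"`
	Target *RowDiff `json:"Target,omitempty"`
}

// toJSON is a hypothetical helper that marshals v and panics on failure.
func toJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Only the source side is populated; Target is nil and Query is empty,
	// so both are omitted from the JSON.
	m := DiffMismatch{Source: &RowDiff{Row: map[string]string{"id": "1"}}}
	fmt.Println(toJSON(m))
}
```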

type DiffReport

type DiffReport struct {
	TableName string

	// counts
	ProcessedRows   int64
	MatchingRows    int64
	MismatchedRows  int64
	ExtraRowsSource int64
	ExtraRowsTarget int64

	// actual data for a few sample rows
	ExtraRowsSourceDiffs []*RowDiff      `json:"ExtraRowsSourceSample,omitempty"`
	ExtraRowsTargetDiffs []*RowDiff      `json:"ExtraRowsTargetSample,omitempty"`
	MismatchedRowsDiffs  []*DiffMismatch `json:"MismatchedRowsSample,omitempty"`
}

DiffReport is the summary of differences for one table.
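
As a rough illustration of how counters like those in DiffReport could be produced, the sketch below walks two PK-sorted row slices in step and classifies each row. This is not the package's comparison code: the `row` and `counts` types and `compareSorted` are hypothetical, and the way ProcessedRows is incremented (once per comparison step or leftover row) is an assumption made for the example.

```go
package main

import "fmt"

// row is a hypothetical stand-in for a table row with an integer PK.
type row struct {
	PK    int
	Value string
}

// counts mirrors only the counter fields of DiffReport.
type counts struct {
	ProcessedRows   int64
	MatchingRows    int64
	MismatchedRows  int64
	ExtraRowsSource int64
	ExtraRowsTarget int64
}

// compareSorted walks two PK-sorted slices in a single pass.
// A PK present on one side only is counted as an extra row on that side;
// a PK present on both sides is counted as matching or mismatched.
func compareSorted(source, target []row) counts {
	var c counts
	i, j := 0, 0
	for i < len(source) && j < len(target) {
		s, t := source[i], target[j]
		switch {
		case s.PK < t.PK:
			c.ExtraRowsSource++
			i++
		case s.PK > t.PK:
			c.ExtraRowsTarget++
			j++
		default:
			if s.Value == t.Value {
				c.MatchingRows++
			} else {
				c.MismatchedRows++
			}
			i++
			j++
		}
		c.ProcessedRows++
	}
	// Whatever remains on either side has no counterpart on the other.
	restSource := int64(len(source) - i)
	restTarget := int64(len(target) - j)
	c.ExtraRowsSource += restSource
	c.ExtraRowsTarget += restTarget
	c.ProcessedRows += restSource + restTarget
	return c
}

func main() {
	source := []row{{1, "a"}, {2, "b"}, {4, "d"}}
	target := []row{{1, "a"}, {2, "x"}, {3, "c"}}
	fmt.Printf("%+v\n", compareSorted(source, target))
}
```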

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

func NewEngine

func NewEngine(ts *topo.Server, tablet *topodata.Tablet, collationEnv *collations.Environment, parser *sqlparser.Parser) *Engine

func NewTestEngine added in v0.16.0

func NewTestEngine(ts *topo.Server, tablet *topodata.Tablet, dbn string, dbcf func() binlogplayer.DBClient, tmcf func() tmclient.TabletManagerClient) *Engine

NewTestEngine creates an Engine for use in tests. It uses the custom db client factory and tablet manager client factory, while setting the fortests field to true to modify any engine behavior when used in tests (e.g. not starting the retry goroutine).

func (*Engine) Close

func (vde *Engine) Close()

Close closes the Engine service.

func (*Engine) InitDBConfig

func (vde *Engine) InitDBConfig(dbcfgs *dbconfigs.DBConfigs)

func (*Engine) IsOpen

func (vde *Engine) IsOpen() bool

IsOpen returns true if Engine is open.

func (*Engine) Open

func (vde *Engine) Open(ctx context.Context, vre *vreplication.Engine)

func (*Engine) PerformVDiffAction

func (vde *Engine) PerformVDiffAction(ctx context.Context, req *tabletmanagerdatapb.VDiffRequest) (resp *tabletmanagerdatapb.VDiffResponse, err error)

type ProgressReport added in v0.15.0

type ProgressReport struct {
	Percentage float64
	ETA        string `json:"ETA,omitempty"` // a formatted date
}
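
The page does not say how Percentage and ETA are derived, so the following is only one plausible sketch: it assumes a linear extrapolation from rows compared so far, and formats the ETA with the layout string that this package exposes as TimestampFormat. The `estimate` and `mustParse` helpers, and the local copy of ProgressReport, are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// timestampFormat has the same value as the package's TimestampFormat constant.
const timestampFormat = "2006-01-02 15:04:05"

// ProgressReport is a local copy of the type documented above.
type ProgressReport struct {
	Percentage float64
	ETA        string `json:"ETA,omitempty"` // a formatted date
}

// mustParse parses a timestamp in timestampFormat (interpreted as UTC).
func mustParse(s string) time.Time {
	t, err := time.Parse(timestampFormat, s)
	if err != nil {
		panic(err)
	}
	return t
}

// estimate is a hypothetical helper that extrapolates linearly: if a given
// share of the rows took `elapsed`, the rest is assumed to proceed at the
// same rate. It returns a zero report when no estimate is possible.
func estimate(rowsCompared, totalRows int64, started, now time.Time) ProgressReport {
	if rowsCompared <= 0 || totalRows <= 0 {
		return ProgressReport{}
	}
	pct := float64(rowsCompared) / float64(totalRows) * 100
	if pct > 100 {
		pct = 100
	}
	elapsed := now.Sub(started)
	remaining := time.Duration(float64(elapsed) * (100 - pct) / pct)
	return ProgressReport{
		Percentage: pct,
		ETA:        now.Add(remaining).Format(timestampFormat),
	}
}

func main() {
	p := estimate(25, 100, mustParse("2024-01-01 00:00:00"), mustParse("2024-01-01 00:10:00"))
	fmt.Printf("%.1f%% done, ETA %s\n", p.Percentage, p.ETA)
}
```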

type RowDiff

type RowDiff struct {
	Row   map[string]string `json:"Row,omitempty"`
	Query string            `json:"Query,omitempty"`
}

RowDiff is a row that didn't match as part of the comparison.

type VDiffAction

type VDiffAction string // nolint
const (
	CreateAction  VDiffAction = "create"
	ShowAction    VDiffAction = "show"
	StopAction    VDiffAction = "stop"
	ResumeAction  VDiffAction = "resume"
	DeleteAction  VDiffAction = "delete"
	AllActionArg              = "all"
	LastActionArg             = "last"
)
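
Because VDiffAction is a plain string type, any string converts to it without a check. A caller that accepts user input may want to validate it first. The sketch below uses local copies of the constants; `parseAction` is a hypothetical helper, and its trimming and case-insensitive matching are assumptions rather than documented package behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// VDiffAction and its values are local copies of the declarations above.
type VDiffAction string

const (
	CreateAction VDiffAction = "create"
	ShowAction   VDiffAction = "show"
	StopAction   VDiffAction = "stop"
	ResumeAction VDiffAction = "resume"
	DeleteAction VDiffAction = "delete"
)

// parseAction is a hypothetical helper that normalizes the input and
// returns an error for anything that is not one of the known actions.
func parseAction(s string) (VDiffAction, error) {
	a := VDiffAction(strings.ToLower(strings.TrimSpace(s)))
	switch a {
	case CreateAction, ShowAction, StopAction, ResumeAction, DeleteAction:
		return a, nil
	}
	return "", fmt.Errorf("unknown vdiff action %q", s)
}

func main() {
	for _, in := range []string{"show", "Resume", "pause"} {
		a, err := parseAction(in)
		fmt.Printf("input=%q action=%q err=%v\n", in, a, err)
	}
}
```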

type VDiffState

type VDiffState string //nolint

VDiff operation and table states: pending/started/stopped/completed/error/unknown

const (
	PendingState    VDiffState = "pending"
	StartedState    VDiffState = "started"
	StoppedState    VDiffState = "stopped"
	CompletedState  VDiffState = "completed"
	ErrorState      VDiffState = "error"
	UnknownState    VDiffState = ""
	TimestampFormat            = "2006-01-02 15:04:05"
)
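
Two details of this block are easy to miss: UnknownState is the empty string, so the zero value of a VDiffState is "unknown", and TimestampFormat is a Go reference-time layout with second precision and no time zone. The sketch below shows both using local copies of the constants; `toState` and `formatTimestamp` are hypothetical helpers, and mapping unrecognized strings to UnknownState is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// VDiffState and the constants below are local copies of the declarations above.
type VDiffState string

const (
	PendingState    VDiffState = "pending"
	StartedState    VDiffState = "started"
	StoppedState    VDiffState = "stopped"
	CompletedState  VDiffState = "completed"
	ErrorState      VDiffState = "error"
	UnknownState    VDiffState = ""
	TimestampFormat            = "2006-01-02 15:04:05"
)

// toState is a hypothetical helper that maps any unrecognized string
// to UnknownState instead of passing it through unchecked.
func toState(s string) VDiffState {
	switch st := VDiffState(s); st {
	case PendingState, StartedState, StoppedState, CompletedState, ErrorState:
		return st
	}
	return UnknownState
}

// formatTimestamp renders a UTC time using TimestampFormat.
func formatTimestamp(year, month, day, hour, minute, second int) string {
	t := time.Date(year, time.Month(month), day, hour, minute, second, 0, time.UTC)
	return t.Format(TimestampFormat)
}

func main() {
	var zero VDiffState
	fmt.Println(zero == UnknownState)           // the zero value is the unknown state
	fmt.Println(toState("started") == StartedState)
	fmt.Println(toState("paused") == UnknownState)
	fmt.Println(formatTimestamp(2024, 12, 3, 9, 5, 7))
}
```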
