scheduler

package

v1.1.0-beta.0...-2761fe0 Latest Latest Go to latest Published: Sep 22, 2024 License: Apache-2.0 Imports: 30 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

github.com/pingcap/tidb

Documentation ¶

Index ¶

Variables
func ClearSchedulerCleanUpFactory()
func ClearSchedulerFactory()
func GetLiveExecIDs(ctx context.Context) ([]string, error)
func IsCancelledErr(err error) bool
func RegisterSchedulerCleanUpFactory(taskType proto.TaskType, ctor cleanUpFactoryFn)
func RegisterSchedulerFactory(taskType proto.TaskType, ctor schedulerFactoryFn)
func VerifyTaskStateTransform(from, to proto.TaskState) bool
type BaseScheduler
- func NewBaseScheduler(ctx context.Context, task *proto.Task, param Param) *BaseScheduler
- func (*BaseScheduler) Close()
- func (s *BaseScheduler) GetPreviousSubtaskMetas(taskID int64, step proto.Step) ([][]byte, error)
- func (s *BaseScheduler) GetTask() *proto.Task
- func (*BaseScheduler) Init() error
- func (s *BaseScheduler) ScheduleTask()
- func (s *BaseScheduler) WithNewSession(fn func(se sessionctx.Context) error) error
- func (s *BaseScheduler) WithNewTxn(ctx context.Context, fn func(se sessionctx.Context) error) error
type CleanUpRoutine
type Extension
- func GetTestSchedulerExt(ctrl *gomock.Controller) Extension
type Manager
- func NewManager(ctx context.Context, taskMgr TaskManager, serverID string) *Manager
- func (sm *Manager) Cancel()
- func (sm *Manager) Initialized() bool
- func (sm *Manager) MockScheduler(task *proto.Task) *BaseScheduler
- func (sm *Manager) Start()
- func (sm *Manager) Stop()
type NodeManager
type Param
type Scheduler
type SlotManager
type TaskManager

Constants ¶

This section is empty.

Variables ¶

View Source

var (
	// CheckTaskFinishedInterval is the interval for scheduler.
	// exported for testing.
	CheckTaskFinishedInterval = 500 * time.Millisecond
	// RetrySQLTimes is the max retry times when executing SQL.
	RetrySQLTimes = 30
	// RetrySQLInterval is the initial interval between two SQL retries.
	RetrySQLInterval = 3 * time.Second
	// RetrySQLMaxInterval is the max interval between two SQL retries.
	RetrySQLMaxInterval = 30 * time.Second
)

View Source

var (
	// CheckTaskRunningInterval is the interval for loading tasks.
	// It is exported for testing.
	CheckTaskRunningInterval = 3 * time.Second

	// DefaultCleanUpInterval is the interval of cleanup routine.
	DefaultCleanUpInterval = 10 * time.Minute
)

View Source

var MockServerInfo atomic.Pointer[[]string]

MockServerInfo exported for scheduler_test.go

View Source

var TestRefreshedChan = make(chan struct{})

TestRefreshedChan is used to sync the test.

View Source

var WaitCleanUpFinished = make(chan struct{}, 1)

WaitCleanUpFinished is used to sync the test.

View Source

var WaitTaskFinished = make(chan struct{})

WaitTaskFinished is used to sync the test.

Functions ¶

func ClearSchedulerCleanUpFactory ¶

func ClearSchedulerCleanUpFactory()

ClearSchedulerCleanUpFactory is only used in test.

func ClearSchedulerFactory ¶

func ClearSchedulerFactory()

ClearSchedulerFactory is only used in test.

func GetLiveExecIDs ¶

func GetLiveExecIDs(ctx context.Context) ([]string, error)

GetLiveExecIDs returns all live executor node IDs.

func IsCancelledErr ¶

func IsCancelledErr(err error) bool

IsCancelledErr checks if the error is a cancelled error.

func RegisterSchedulerCleanUpFactory ¶

func RegisterSchedulerCleanUpFactory(taskType proto.TaskType, ctor cleanUpFactoryFn)

RegisterSchedulerCleanUpFactory is used to register the scheduler clean up factory. normally scheduler cleanup is used in the scheduler_manager gcTaskLoop to do clean up works when tasks are finished.

func RegisterSchedulerFactory ¶

func RegisterSchedulerFactory(taskType proto.TaskType, ctor schedulerFactoryFn)

RegisterSchedulerFactory is used to register the scheduler factory. normally scheduler ctor should be registered before the server start. and should be called in a single routine, such as in init(). after the server start, there's should be no write to the map. but for index backfill, the register call stack is so deep, not sure if it's safe to do so, so we use a lock here.

func VerifyTaskStateTransform ¶

func VerifyTaskStateTransform(from, to proto.TaskState) bool

VerifyTaskStateTransform verifies whether the task state transform is valid.

Types ¶

type BaseScheduler ¶

type BaseScheduler struct {
	Param

	// when RegisterSchedulerFactory, the factory MUST initialize this fields.
	Extension
	// contains filtered or unexported fields
}

BaseScheduler is the base struct for Scheduler. each task type embed this struct and implement the Extension interface.

func NewBaseScheduler ¶

func NewBaseScheduler(ctx context.Context, task *proto.Task, param Param) *BaseScheduler

NewBaseScheduler creates a new BaseScheduler.

func (*BaseScheduler) Close ¶

func (*BaseScheduler) Close()

Close closes the scheduler.

func (*BaseScheduler) GetPreviousSubtaskMetas ¶

func (s *BaseScheduler) GetPreviousSubtaskMetas(taskID int64, step proto.Step) ([][]byte, error)

GetPreviousSubtaskMetas get subtask metas from specific step.

func (*BaseScheduler) GetTask ¶

func (s *BaseScheduler) GetTask() *proto.Task

GetTask implements the Scheduler interface.

func (*BaseScheduler) Init ¶

func (*BaseScheduler) Init() error

Init implements the Scheduler interface.

func (*BaseScheduler) ScheduleTask ¶

func (s *BaseScheduler) ScheduleTask()

ScheduleTask implements the Scheduler interface.

func (*BaseScheduler) WithNewSession ¶

func (s *BaseScheduler) WithNewSession(fn func(se sessionctx.Context) error) error

WithNewSession executes the function with a new session.

func (*BaseScheduler) WithNewTxn ¶

func (s *BaseScheduler) WithNewTxn(ctx context.Context, fn func(se sessionctx.Context) error) error

WithNewTxn executes the fn in a new transaction.

type CleanUpRoutine ¶

type CleanUpRoutine interface {
	// CleanUp do the cleanup work.
	// task.Meta can be updated here, such as redacting some sensitive info.
	CleanUp(ctx context.Context, task *proto.Task) error
}

CleanUpRoutine is used for the framework to do some clean up work if the task is finished.

type Extension ¶

type Extension interface {
	// OnTick is used to handle the ticker event, if business impl need to do some periodical work, you can
	// do it here, but don't do too much work here, because the ticker interval is small, and it will block
	// the event is generated every CheckTaskRunningInterval, and only when the task NOT FINISHED and NO ERROR.
	OnTick(ctx context.Context, task *proto.Task)

	// OnNextSubtasksBatch is used to generate batch of subtasks for next stage
	// NOTE: don't change task.State inside, framework will manage it.
	// it's called when:
	// 	1. task is pending and entering it's first step.
	// 	2. subtasks scheduled has all finished with no error.
	// when next step is StepDone, it should return nil, nil.
	OnNextSubtasksBatch(ctx context.Context, h storage.TaskHandle, task *proto.Task, execIDs []string, step proto.Step) (subtaskMetas [][]byte, err error)

	// OnDone is called when task is done, either finished successfully or failed
	// with error.
	// if the task is failed when initializing scheduler, or it's an unknown task,
	// we don't call this function.
	OnDone(ctx context.Context, h storage.TaskHandle, task *proto.Task) error

	// GetEligibleInstances is used to get the eligible instances for the task.
	// on certain condition we may want to use some instances to do the task, such as instances with more disk.
	// if returned instances is empty, it means all instances are eligible.
	// TODO: run import from server disk using framework makes this logic complicated,
	// the instance might not be managed by framework.
	GetEligibleInstances(ctx context.Context, task *proto.Task) ([]string, error)

	// IsRetryableErr is used to check whether the error occurred in scheduler is retryable.
	IsRetryableErr(err error) bool

	// GetNextStep is used to get the next step for the task.
	// if task runs successfully, it should go from StepInit to business steps,
	// then to StepDone, then scheduler will mark it as finished.
	// NOTE: don't depend on task meta to decide the next step, if it's really needed,
	// initialize required fields on scheduler.Init
	GetNextStep(task *proto.TaskBase) proto.Step
}

Extension is used to control the process operations for each task. it's used to extend functions of BaseScheduler. as golang doesn't support inheritance, we embed this interface in Scheduler to simulate abstract method as in other OO languages.

func GetTestSchedulerExt ¶

func GetTestSchedulerExt(ctrl *gomock.Controller) Extension

GetTestSchedulerExt return scheduler.Extension for testing.

type Manager ¶

type Manager struct {
	// contains filtered or unexported fields
}

Manager manage a bunch of schedulers. Scheduler schedule and monitor tasks. The scheduling task number is limited by size of gPool.

func NewManager ¶

func NewManager(ctx context.Context, taskMgr TaskManager, serverID string) *Manager

NewManager creates a scheduler struct.

func (*Manager) Cancel ¶

func (sm *Manager) Cancel()

Cancel cancels the scheduler manager. used in test to simulate tidb node shutdown.

func (*Manager) Initialized ¶

func (sm *Manager) Initialized() bool

Initialized check the manager initialized.

func (*Manager) MockScheduler ¶

func (sm *Manager) MockScheduler(task *proto.Task) *BaseScheduler

MockScheduler mock one scheduler for one task, only used for tests.

func (*Manager) Start ¶

func (sm *Manager) Start()

Start the schedulerManager, start the scheduleTaskLoop to start multiple schedulers.

func (*Manager) Stop ¶

func (sm *Manager) Stop()

Stop the schedulerManager.

type NodeManager ¶

type NodeManager struct {
	// contains filtered or unexported fields
}

NodeManager maintains live TiDB nodes in the cluster, and maintains the nodes managed by the framework.

type Param ¶

type Param struct {
	// contains filtered or unexported fields
}

Param is used to pass parameters when creating scheduler.

type Scheduler ¶

type Scheduler interface {
	// Init initializes the scheduler, should be called before ExecuteTask.
	// if Init returns error, scheduler manager will fail the task directly,
	// so the returned error should be a fatal error.
	Init() error
	// ScheduleTask schedules the task execution step by step.
	ScheduleTask()
	// Close closes the scheduler, should be called if Init returns nil.
	Close()
	// GetTask returns the task that the scheduler is managing.
	GetTask() *proto.Task
	Extension
}

Scheduler manages the lifetime of a task including submitting subtasks and updating the status of a task.

type SlotManager ¶

type SlotManager struct {
	// contains filtered or unexported fields
}

SlotManager is used to manage the resource slots and stripes.

Slot is the resource unit of dist framework on each node, each slot represents 1 cpu core, 1/total-core of memory, 1/total-core of disk, etc.

Stripe is the resource unit of dist framework, regardless of the node, each stripe means 1 slot on all nodes managed by dist framework. Number of stripes is equal to number of slots on each node, as we assume that all nodes managed by dist framework are isomorphic. Stripes reserved for a task defines the maximum resource that a task can use but the task might not use all the resources. To maximize the resource utilization, we will try to schedule as many tasks as possible depends on the used slots on each node and the minimum resource required by the tasks, and in this case, we don't consider task order.

Dist framework will try to allocate resource by slots and stripes, and give quota to subtask, but subtask can determine what to conform.

type TaskManager ¶

type TaskManager interface {
	// GetTopUnfinishedTasks returns unfinished tasks, limited by MaxConcurrentTask*2,
	// to make sure lower rank tasks can be scheduled if resource is enough.
	// The returned tasks are sorted by task order, see proto.Task.
	GetTopUnfinishedTasks(ctx context.Context) ([]*proto.TaskBase, error)
	// GetAllSubtasks gets all subtasks with basic columns.
	GetAllSubtasks(ctx context.Context) ([]*proto.SubtaskBase, error)
	GetTasksInStates(ctx context.Context, states ...any) (task []*proto.Task, err error)
	GetTaskByID(ctx context.Context, taskID int64) (task *proto.Task, err error)
	GetTaskBaseByID(ctx context.Context, taskID int64) (task *proto.TaskBase, err error)
	GCSubtasks(ctx context.Context) error
	GetAllNodes(ctx context.Context) ([]proto.ManagedNode, error)
	DeleteDeadNodes(ctx context.Context, nodes []string) error
	// TransferTasks2History transfer tasks, and it's related subtasks to history tables.
	TransferTasks2History(ctx context.Context, tasks []*proto.Task) error
	// CancelTask updated task state to canceling.
	CancelTask(ctx context.Context, taskID int64) error
	// FailTask updates task state to Failed and updates task error.
	FailTask(ctx context.Context, taskID int64, currentState proto.TaskState, taskErr error) error
	// RevertTask updates task state to reverting, and task error.
	RevertTask(ctx context.Context, taskID int64, taskState proto.TaskState, taskErr error) error
	// RevertedTask updates task state to reverted.
	RevertedTask(ctx context.Context, taskID int64) error
	// PauseTask updated task state to pausing.
	PauseTask(ctx context.Context, taskKey string) (bool, error)
	// PausedTask updated task state to paused.
	PausedTask(ctx context.Context, taskID int64) error
	// ResumedTask updated task state from resuming to running.
	ResumedTask(ctx context.Context, taskID int64) error
	// SucceedTask updates a task to success state.
	SucceedTask(ctx context.Context, taskID int64) error
	// SwitchTaskStep switches the task to the next step and add subtasks in one
	// transaction. It will change task state too if we're switch from InitStep to
	// next step.
	SwitchTaskStep(ctx context.Context, task *proto.Task, nextState proto.TaskState, nextStep proto.Step, subtasks []*proto.Subtask) error
	// SwitchTaskStepInBatch similar to SwitchTaskStep, but it will insert subtasks
	// in batch, and task step change will be in a separate transaction.
	// Note: subtasks of this step must be stable, i.e. count, order and content
	// should be the same on each try, else the subtasks inserted might be messed up.
	// And each subtask of this step must be different, to handle the network
	// partition or owner change.
	SwitchTaskStepInBatch(ctx context.Context, task *proto.Task, nextState proto.TaskState, nextStep proto.Step, subtasks []*proto.Subtask) error
	// GetUsedSlotsOnNodes returns the used slots on nodes that have subtask scheduled.
	// subtasks of each task on one node is only accounted once as we don't support
	// running them concurrently.
	// we only consider pending/running subtasks, subtasks related to revert are
	// not considered.
	GetUsedSlotsOnNodes(ctx context.Context) (map[string]int, error)
	// GetActiveSubtasks returns subtasks of the task that are in pending/running state.
	GetActiveSubtasks(ctx context.Context, taskID int64) ([]*proto.SubtaskBase, error)
	// GetSubtaskCntGroupByStates returns the count of subtasks of some step group by state.
	GetSubtaskCntGroupByStates(ctx context.Context, taskID int64, step proto.Step) (map[proto.SubtaskState]int64, error)
	ResumeSubtasks(ctx context.Context, taskID int64) error
	GetSubtaskErrors(ctx context.Context, taskID int64) ([]error, error)
	UpdateSubtasksExecIDs(ctx context.Context, subtasks []*proto.SubtaskBase) error

	// GetAllSubtasksByStepAndState gets all subtasks by given states for one step.
	GetAllSubtasksByStepAndState(ctx context.Context, taskID int64, step proto.Step, state proto.SubtaskState) ([]*proto.Subtask, error)

	WithNewSession(fn func(se sessionctx.Context) error) error
	WithNewTxn(ctx context.Context, fn func(se sessionctx.Context) error) error
}

TaskManager defines the interface to access task table.

Source Files ¶

View all Source files

Directories ¶

Path	Synopsis
mock Package mock is a generated GoMock package.	Package mock is a generated GoMock package.

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL