scheduler

package

v0.0.0-...-e32b5e2 Latest Published: Nov 11, 2024 License: AGPL-3.0 Imports: 36 Imported by: 0

Repository

github.com/grafana/mimir

README ¶

Query Scheduler

What is the Query Scheduler?

The Query Scheduler is a dedicated request queue broker component between the Query Frontend instances and the pool of Queriers available to service the queries.

The Query Frontend instances split and shard query requests, then enqueue the requests to the Scheduler.
The Scheduler pushes the queries into its internal in-memory RequestQueue process.
Queriers connect to the Scheduler and enter a loop to continually request their next query to process.
Queriers also maintain connections to the Query Frontend to send query results directly back to the Frontend.
The Scheduler does not receive query results, but will report an error for the query to the Frontend if the connection to the Querier is broken during query processing, so the Frontend can handle the failure without waiting for the full context timeout.
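The flow above can be sketched with plain Go channels. This is a minimal illustration, not the Scheduler's actual implementation: the request type, querierLoop, and runQueries are hypothetical names, a buffered channel stands in for the in-memory RequestQueue, and the gRPC connections are omitted. It shows the one property that matters here: workers pull from the queue and send results straight back to the Frontend, so the queue never sees a result.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// request is a hypothetical stand-in for a query enqueued by a Query Frontend.
type request struct {
	query   string
	results chan<- string // Frontend-owned channel; the queue never reads it
}

// querierLoop models one Querier-Worker: it keeps asking the queue for its
// next request and sends each result directly back to the Frontend.
func querierLoop(queue <-chan request, wg *sync.WaitGroup) {
	defer wg.Done()
	for req := range queue {
		req.results <- "result(" + req.query + ")"
	}
}

// runQueries plays the Frontend role: enqueue every query, then collect the
// results that the workers send back.
func runQueries(queries []string, workers int) []string {
	queue := make(chan request, len(queries))  // stand-in for the RequestQueue
	results := make(chan string, len(queries)) // buffered so workers never block

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go querierLoop(queue, &wg)
	}
	for _, q := range queries {
		queue <- request{query: q, results: results}
	}
	close(queue)
	wg.Wait()
	close(results)

	var out []string
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // worker scheduling is nondeterministic, so sort for stable output
	return out
}

func main() {
	fmt.Println(runQueries([]string{"up", "rate(http_requests_total[5m])"}, 2))
}
```

The sketch leaves out error reporting: in the real component the Scheduler also notifies the Frontend when a Querier connection breaks mid-query.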

Why Do We Need the Query Scheduler?

A dedicated queue component solves scalability issues which can arise from the usage of the "v1 Frontend", in which the RequestQueue process was embedded in each Query Frontend instance, and Queriers connected directly to the Frontends to request their next query to process.

Query Frontend Resource Requirements

The Scheduler has limited responsibilities and low resource usage compared to the Query Frontend. The Query Frontend is a complex component which must parse requests into Prometheus queries, split, shard, and rewrite requests, and then merge large result sets from Queriers back together.

The Query Frontend therefore has higher baseline needs for its horizontal scale, and often needs to autoscale further due to increased query count or complexity. This need for higher pod counts introduces connection scaling issues with the Queriers.

Querier Connection Scaling

Each of N active Querier pods maintains at least one connection to each of M component pods dispatching queries, whether that dispatching component is the Query Frontend or the Query Scheduler.

When Mimir is deployed using the v1 Frontend, the Queriers connect directly to the Frontends. The horizontal scale of both Frontends and Queriers can quickly create a very large number of connections to maintain. A Mimir deployment may have 2 Frontends and 20 Queriers at idle and scale quickly up to 10 Frontends and 100 Queriers. In response, each Querier must increase its number of concurrent Querier-Worker goroutines in order to maintain the minimum connections to each available Frontend.

Each of those Querier-Worker goroutines does full end-to-end data fetching and query processing. Because the Querier does not have a robust internal queuing or rate-limiting mechanism, excessive concurrency can quickly cause issues as resource usage spikes and runs up against limits. The system has no ability to react to those queries stuck or slowed down in the Querier other than waiting for completion, cancellation, or timeout.

In contrast, a Mimir deployment of any size generally only uses 2 Scheduler pods, and the Scheduler does not utilize horizontal autoscaling. Each Querier can therefore maintain fewer connections to receive queries and is far less likely to experience issues with excessive concurrency.
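The connection arithmetic in this section can be made concrete with a few lines of Go. The helper name minConnections is hypothetical, and the pod counts are the illustrative figures from the text above, not measurements. The function computes the lower bound of one connection from each of N Queriers to each of M dispatching pods.

```go
package main

import "fmt"

// minConnections returns the lower bound on Querier connections: each Querier
// pod holds at least one connection to each dispatching pod.
func minConnections(queriers, dispatchers int) int {
	return queriers * dispatchers
}

func main() {
	fmt.Println(minConnections(20, 2))   // v1 Frontend at idle: 40
	fmt.Println(minConnections(100, 10)) // v1 Frontend scaled up: 1000
	fmt.Println(minConnections(100, 2))  // 2 Schedulers, same Querier count: 200
}
```

Seen from a single Querier, the minimum number of concurrent Querier-Worker goroutines equals the number of dispatching pods: 10 in the scaled-up v1 Frontend case, but only 2 when the Queriers connect to Schedulers.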

Querier-Scheduler Connection Lifecycle

The Querier uses service discovery to track when a Scheduler instance is available. It creates Querier-Worker connections to the Scheduler, each of which begins the QuerierLoop, the core method that triggers changes to the state of Querier-Worker connections to the Scheduler.

The Scheduler tracks Querier-Worker connections by Querier ID. This connection tracking serves two purposes:

Adding and removing Queriers from the Tenant-Querier shuffle sharding when the first worker connection from a Querier is added or when the last connection is removed.
Enabling graceful shutdowns for both Scheduler and Querier (see below).
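A per-Querier connection counter is enough to illustrate the first purpose. The sketch below is an assumption about the general shape of such tracking, not the RequestQueue's real code: connTracker, add, and remove are hypothetical names. The return values mark the two moments at which a shuffle-shard update would be needed.

```go
package main

import "fmt"

// connTracker is a hypothetical sketch that counts worker connections per Querier ID.
type connTracker struct {
	conns map[string]int
}

func newConnTracker() *connTracker {
	return &connTracker{conns: map[string]int{}}
}

// add registers one worker connection and reports whether it is the first one
// for this Querier, which is when the Querier would join the shuffle sharding.
func (c *connTracker) add(querierID string) (first bool) {
	c.conns[querierID]++
	return c.conns[querierID] == 1
}

// remove deregisters one worker connection and reports whether it was the last
// one, which is when the Querier would leave the shuffle sharding.
func (c *connTracker) remove(querierID string) (last bool) {
	if c.conns[querierID] == 0 {
		return false // unknown Querier, nothing to remove
	}
	c.conns[querierID]--
	if c.conns[querierID] == 0 {
		delete(c.conns, querierID)
		return true
	}
	return false
}

func main() {
	c := newConnTracker()
	fmt.Println(c.add("querier-1"))    // true: first connection
	fmt.Println(c.add("querier-1"))    // false: Querier already known
	fmt.Println(c.remove("querier-1")) // false: one connection remains
	fmt.Println(c.remove("querier-1")) // true: last connection removed
}
```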

Querier Shutdown

There are two ways for the persistent Querier-Worker connections to disconnect from the Scheduler. Both utilize the control flow in the QuerierLoop RPC between Querier-Workers and the Scheduler, triggering state updates when the loop begins and using defers to perform cleanup tasks when the loop disconnects.

Querier-Worker Disconnection Process 1: Errors in QuerierLoop

If any part of the Scheduler's QuerierLoop errors out for any reason, it triggers defers which deregister the Querier-Worker connection from the queue. The errors captured by this process are generally unexpected, such as broken or timed-out connections.

The defer calls in the Scheduler's QuerierLoop tell the RequestQueue to decrement the connection count for that Querier ID.
If the Querier no longer has any active worker connections and querier-forget-delay is not enabled, the Querier will be immediately deregistered from the RequestQueue, triggering a shuffle-shard update.
Otherwise, if the Querier no longer has any worker connections and querier-forget-delay is enabled, the RequestQueue records the time at which the Querier disconnected but does not deregister it yet.
The RequestQueue periodically calls forgetDisconnectedQueriers, which deregisters all Queriers with no remaining connections whose time since disconnectedAt exceeds the querier-forget-delay grace period, then triggers shuffle-shard updates.
If the Querier reconnects during the grace period, the disconnectedAt state for the Querier is cleared and it will not be affected by the calls to forgetDisconnectedQueriers.

Querier-Worker Disconnection Process 2: Graceful Shutdown Notification from Querier

The top-level Querier process calls the NotifyQuerierShutdown RPC on the Scheduler to tell the Scheduler that the Querier is shutting down. This is the expected behavior we see when Queriers are shut down as part of normal rollout or downscaling procedures. With minor differences, this process utilizes the same mechanisms as the error-case process described above.

The Scheduler tells the RequestQueue to mark the connection state as shuttingDown for the given Querier ID.
All worker connections from that Querier ID in the QuerierLoop will receive an ErrQuerierShuttingDown from the RequestQueue when they ask to dequeue their next query request.
The rest of the lifecycle is handled by the logic in the "Errors in QuerierLoop" process described above, with the exception that querier-forget-delay is ignored when the Querier is in a shuttingDown state. The Querier is removed as soon as the last Querier-Worker connection is deregistered, and shuffle-shard updates are triggered.
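The graceful path can be sketched in the same simplified style. The types and method names below (requestQueue, querierConns, notifyShutdown, dequeue, removeWorker) are hypothetical, and errQuerierShuttingDown is a local stand-in for the ErrQuerierShuttingDown error named above. The sketch shows the two differences from the error path: workers are told to stop at their next dequeue, and the forget delay is skipped.

```go
package main

import (
	"errors"
	"fmt"
)

// errQuerierShuttingDown is a local stand-in for ErrQuerierShuttingDown.
var errQuerierShuttingDown = errors.New("querier is shutting down")

// querierConns is a hypothetical per-Querier record.
type querierConns struct {
	workers      int
	shuttingDown bool
}

// requestQueue is a hypothetical, single-goroutine model of the tracking state.
type requestQueue struct {
	forgetDelayEnabled bool
	queriers           map[string]*querierConns
}

// notifyShutdown marks the Querier so that its workers stop receiving queries.
func (q *requestQueue) notifyShutdown(id string) {
	if c, ok := q.queriers[id]; ok {
		c.shuttingDown = true
	}
}

// dequeue is what a worker calls to ask for its next query request.
func (q *requestQueue) dequeue(id string) (string, error) {
	if c, ok := q.queriers[id]; ok && c.shuttingDown {
		return "", errQuerierShuttingDown
	}
	return "next-query", nil // placeholder; a real queue would block until work arrives
}

// removeWorker models the loop's deferred cleanup and reports whether the
// Querier was deregistered as a result.
func (q *requestQueue) removeWorker(id string) bool {
	c, ok := q.queriers[id]
	if !ok {
		return false
	}
	c.workers--
	if c.workers > 0 {
		return false
	}
	// The forget delay only applies to Queriers that did not announce shutdown.
	if c.shuttingDown || !q.forgetDelayEnabled {
		delete(q.queriers, id)
		return true
	}
	return false
}

func main() {
	q := &requestQueue{
		forgetDelayEnabled: true,
		queriers:           map[string]*querierConns{"querier-1": {workers: 2}},
	}
	q.notifyShutdown("querier-1")
	_, err := q.dequeue("querier-1")
	fmt.Println(errors.Is(err, errQuerierShuttingDown))
	fmt.Println(q.removeWorker("querier-1"), q.removeWorker("querier-1"))
}
```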

Documentation ¶

Index ¶

type Config
- func (cfg *Config) RegisterFlags(f *flag.FlagSet, logger log.Logger)
- func (cfg *Config) Validate() error
type Limits
type Scheduler
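- func NewScheduler(cfg Config, limits Limits, log log.Logger, registerer prometheus.Registerer) (*Scheduler, error)
- func (s *Scheduler) FrontendLoop(frontend schedulerpb.SchedulerForFrontend_FrontendLoopServer) error
- func (s *Scheduler) NotifyQuerierShutdown(ctx context.Context, req *schedulerpb.NotifyQuerierShutdownRequest) (*schedulerpb.NotifyQuerierShutdownResponse, error)
- func (s *Scheduler) QuerierLoop(querier schedulerpb.SchedulerForQuerier_QuerierLoopServer) error
- func (s *Scheduler) RingHandler(w http.ResponseWriter, req *http.Request)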

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Config ¶

type Config struct {
	MaxOutstandingPerTenant int           `yaml:"max_outstanding_requests_per_tenant"`
	QuerierForgetDelay      time.Duration `yaml:"querier_forget_delay" category:"experimental"`

	GRPCClientConfig grpcclient.Config         `yaml:"grpc_client_config" doc:"description=This configures the gRPC client used to report errors back to the query-frontend."`
	ServiceDiscovery schedulerdiscovery.Config `yaml:",inline"`
}

func (*Config) RegisterFlags ¶

func (cfg *Config) RegisterFlags(f *flag.FlagSet, logger log.Logger)

func (*Config) Validate ¶

func (cfg *Config) Validate() error

type Limits ¶

type Limits interface {
	// MaxQueriersPerUser returns max queriers to use per tenant, or 0 if shuffle sharding is disabled.
	MaxQueriersPerUser(user string) int
}

Limits needed for the Query Scheduler - interface used for decoupling.
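Because Limits is a one-method interface, a test or a small deployment could satisfy it with a simple struct. The sketch below redeclares the interface locally so that it compiles on its own; staticLimits and its fields are hypothetical and are not part of this package.

```go
package main

import "fmt"

// Limits mirrors the interface documented above so the example is self-contained.
type Limits interface {
	// MaxQueriersPerUser returns max queriers to use per tenant, or 0 if shuffle sharding is disabled.
	MaxQueriersPerUser(user string) int
}

// staticLimits is a hypothetical implementation with a default value and
// optional per-tenant overrides.
type staticLimits struct {
	defaultMax int
	perTenant  map[string]int
}

// MaxQueriersPerUser returns the tenant's override if one exists, else the default.
func (l staticLimits) MaxQueriersPerUser(user string) int {
	if v, ok := l.perTenant[user]; ok {
		return v
	}
	return l.defaultMax
}

func main() {
	var limits Limits = staticLimits{
		defaultMax: 0, // shuffle sharding disabled unless overridden
		perTenant:  map[string]int{"tenant-a": 4},
	}
	fmt.Println(limits.MaxQueriersPerUser("tenant-a")) // 4
	fmt.Println(limits.MaxQueriersPerUser("tenant-b")) // 0
}
```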

type Scheduler ¶

type Scheduler struct {
	services.Service
	// contains filtered or unexported fields
}

Scheduler is responsible for queueing and dispatching queries to Queriers.

func NewScheduler ¶

func NewScheduler(cfg Config, limits Limits, log log.Logger, registerer prometheus.Registerer) (*Scheduler, error)

NewScheduler creates a new Scheduler.

func (*Scheduler) FrontendLoop ¶

func (s *Scheduler) FrontendLoop(frontend schedulerpb.SchedulerForFrontend_FrontendLoopServer) error

FrontendLoop handles connection from frontend.

func (*Scheduler) NotifyQuerierShutdown ¶

func (s *Scheduler) NotifyQuerierShutdown(ctx context.Context, req *schedulerpb.NotifyQuerierShutdownRequest) (*schedulerpb.NotifyQuerierShutdownResponse, error)

func (*Scheduler) QuerierLoop ¶

func (s *Scheduler) QuerierLoop(querier schedulerpb.SchedulerForQuerier_QuerierLoopServer) error

QuerierLoop is started by querier to receive queries from scheduler.

func (*Scheduler) RingHandler ¶

func (s *Scheduler) RingHandler(w http.ResponseWriter, req *http.Request)

Source Files ¶

scheduler.go

Directories ¶

Path	Synopsis
queue
tree
schedulerdiscovery
schedulerpb
