planner

package
v0.16.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 25, 2023 License: MIT Imports: 12 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Optimize

func Optimize(s *stream.Stream, catalog *database.Catalog) (*stream.Stream, error)

Optimize takes a tree, applies a list of optimization rules and returns an optimized tree. Depending on the rule, the tree may be modified in place or replaced by a new one.

func PrecalculateExprRule

func PrecalculateExprRule(sctx *StreamContext) error

PrecalculateExprRule evaluates any constant sub-expression that can be evaluated before running the query and replaces it by the result of the evaluation. The result of constant sub-expressions, like "3 + 4", is always the same and thus can be precalculated. Examples:

3 + 4 --> 7
3 + 1 > 10 - a --> 4 > 10 - a

func RemoveUnnecessaryFilterNodesRule

func RemoveUnnecessaryFilterNodesRule(sctx *StreamContext) error

RemoveUnnecessaryFilterNodesRule removes any filter node whose condition is a constant expression that evaluates to a truthy value. if it evaluates to a falsy value, it considers that the tree will not stream any object, so it returns an empty tree.

func RemoveUnnecessaryProjection

func RemoveUnnecessaryProjection(sctx *StreamContext) error

RemoveUnnecessaryProjection removes any project node whose expression is a wildcard only.

func RemoveUnnecessaryTempSortNodesRule

func RemoveUnnecessaryTempSortNodesRule(sctx *StreamContext) error

RemoveUnnecessaryTempSortNodesRule removes any duplicate TempSort node. For each stream, there can be at most two TempSort nodes. In the following case, we can remove the second TempSort node.

SELECT * FROM foo GROUP BY a ORDER BY a
table.Scan('foo') | docs.TempSort(a) | docs.GroupBy(a) | docs.TempSort(a)

This only works if both temp sort nodes use the same path

func SelectIndex

func SelectIndex(sctx *StreamContext) error

SelectIndex attempts to replace a sequential scan by an index scan or a pk scan by analyzing the stream for indexable filter nodes. It expects the first node of the stream to be a table.Scan.

Compatibility of filter nodes.

For a filter node to be selected if must be of the following form:

<path> <compatible operator> <expression>

or

<expression> <compatible operator> <path>

path: path of an object compatible operator: one of =, >, >=, <, <=, IN expression: any expression

Index compatibility.

Once we have a list of all compatible filter nodes, we try to associate indexes with them. Given the following index:

CREATE INDEX foo_a_idx ON foo (a)

and this query:

SELECT * FROM foo WHERE a > 5 AND b > 10
table.Scan('foo') | rows.Filter(a > 5) | rows.Filter(b > 10) | rows.Project(*)

foo_a_idx matches rows.Filter(a > 5) and can be selected. Now, with a different index:

CREATE INDEX foo_a_b_c_idx ON foo(a, b, c)

and this query:

SELECT * FROM foo WHERE a > 5 AND c > 20
table.Scan('foo') | rows.Filter(a > 5) | rows.Filter(c > 20) | rows.Project(*)

foo_a_b_c_idx matches with the first filter because a is the leftmost path indexed by it. The second filter is not selected because it is not the second leftmost path. For composite indexes, filter nodes can be selected if they match with one or more indexed path consecutively, from left to right. Now, let's have a look a this query:

SELECT * FROM foo WHERE a = 5 AND b = 10 AND c > 15 AND d > 20
table.Scan('foo') | rows.Filter(a = 5) | rows.Filter(b = 10) | rows.Filter(c > 15) | rows.Filter(d > 20) | rows.Project(*)

foo_a_b_c_idx matches with first three filters because they satisfy several conditions: - each of them matches with the first 3 indexed paths, consecutively. - the first 2 filters use the equal operator A counter-example:

SELECT * FROM foo WHERE a = 5 AND b > 10 AND c > 15 AND d > 20
table.Scan('foo') | rows.Filter(a = 5) | rows.Filter(b > 10) | rows.Filter(c > 15) | rows.Filter(d > 20) | rows.Project(*)

foo_a_b_c_idx only matches with the first two filter nodes because while the first node uses the equal operator, the second one doesn't, and thus the third node cannot be selected as well.

Candidates and cost

Because a table can have multiple indexes, we need to establish which of these indexes should be used to run the query, if not all of them. For that we generate a cost for each selected index and return the one with the cheapest cost.

func SplitANDConditionRule

func SplitANDConditionRule(sctx *StreamContext) error

SplitANDConditionRule splits any filter node whose condition is one or more AND operators into one or more filter nodes. The condition won't be split if the expression tree contains an OR operation. Example:

this:
  rows.Filter(a > 2 AND b != 3 AND c < 2)
becomes this:
  rows.Filter(a > 2)
  rows.Filter(b != 3)
  rows.Filter(c < 2)

Types

type StreamContext

type StreamContext struct {
	Catalog       *database.Catalog
	Stream        *stream.Stream
	Filters       []*rows.FilterOperator
	Projections   []*rows.ProjectOperator
	TempTreeSorts []*rows.TempTreeSortOperator
}

func NewStreamContext

func NewStreamContext(s *stream.Stream) *StreamContext

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL