table_function

package
v1.0.0-rc2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 24, 2023 License: Apache-2.0 Imports: 32 Imported by: 0

README

UNNEST

函数说明

UNNEST是一个表函数,出现在 SQL 的 from 子句中,用于将 json[object|array]类型的数据展开为多行,每行包含json中的一个元素.

语法结构

> UNNEST(src[, path[, outer]])

相关参数

参数 说明 类型
src 必要参数,待展开的数据源 类型可以是 json 列或 json 字符串
path 可选参数,指明待展开数据源的具体 json 路径。默认为"$",展开整个 json 数据 path 字符串
outer 可选参数,如果数据源展开后结果行为 0,是否加上一个空行作为标记。默认为 false bool 类型

返回表结构

字段名 类型 说明
col varchar 数据源的名称。如果数据源是列,则是列名;如果数据源是 json 字符串,则是"UNNEST_DEFAULT"
seq int32 数据源中元素的索引,从 0 开始
key varchar 展开元素的键名,如果父级元素是数组,则为 null
path varchar 展开元素的在数据源中的路径
index int32 展开元素在父级元素中的索引,如果数据源是对象,则为 null
value varchar 展开元素的值
this varchar 展开元素的父级元素值

示例

> select *
> from unnest('{"a":1,"b":2,"c":3}') as u;
+----------------+------+------+------+-------+-------+--------------------------+
| col            | seq  | key  | path | index | value | this                     |
+----------------+------+------+------+-------+-------+--------------------------+
| UNNEST_DEFAULT |    0 | a    | $.a  |  NULL | 1     | {"a": 1, "b": 2, "c": 3} |
| UNNEST_DEFAULT |    0 | b    | $.b  |  NULL | 2     | {"a": 1, "b": 2, "c": 3} |
| UNNEST_DEFAULT |    0 | c    | $.c  |  NULL | 3     | {"a": 1, "b": 2, "c": 3} |
+----------------+------+------+------+-------+-------+--------------------------+

> select *
> from unnest('{"a":1,"b":2,"c":3}') as u
> where u.`key` = 'b';
+----------------+------+------+------+-------+-------+--------------------------+
| col            | seq  | key  | path | index | value | this                     |
+----------------+------+------+------+-------+-------+--------------------------+
| UNNEST_DEFAULT |    0 | b    | $.b  |  NULL | 2     | {"a": 1, "b": 2, "c": 3} |
+----------------+------+------+------+-------+-------+--------------------------+

> select *
> from unnest('{"a":1,"b":2,"c":3}',"$.b") as u;
Empty set (0.01 sec)

> select *
> from unnest('{"a":1,"b":2,"c":3}',"$.b",true) as u;
+----------------+------+------+------+-------+-------+--------------------------+
| col            | seq  | key  | path | index | value | this                     |
+----------------+------+------+------+-------+-------+--------------------------+
| UNNEST_DEFAULT |    0 | NULL | $.b  |  NULL | NULL  | 2                        |

> drop table if exists t1;
> create table t1 (a json,b int);
> insert into t1
> values ('{"a":1,"b":[{"c":2,"d":3},false,4],"e":{"f":true,"g":[null,true,1.1]}}',1);
> insert into t1
> values ('[1,true,false,null,"aaa",1.1,{"t":false}]',2);
> select * from unnest(t1.a, "$.b") as u;
+------+------+------+--------+-------+------------------+------------------------------+
| col  | seq  | key  | path   | index | value            | this                         |
+------+------+------+--------+-------+------------------+------------------------------+
| a    |    0 | NULL | $.b[0] |     0 | {"c": 2, "d": 3} | [{"c": 2, "d": 3}, false, 4] |
| a    |    0 | NULL | $.b[1] |     1 | false            | [{"c": 2, "d": 3}, false, 4] |
| a    |    0 | NULL | $.b[2] |     2 | 4                | [{"c": 2, "d": 3}, false, 4] |
+------+------+------+--------+-------+------------------+------------------------------+

> select * from unnest(t1.a, "$.b[0]") as u;
+------+------+------+----------+-------+-------+------------------+
| col  | seq  | key  | path     | index | value | this             |
+------+------+------+----------+-------+-------+------------------+
| a    |    0 | c    | $.b[0].c |  NULL | 2     | {"c": 2, "d": 3} |
| a    |    0 | d    | $.b[0].d |  NULL | 3     | {"c": 2, "d": 3} |
+------+------+------+----------+-------+-------+------------------+

> select distinct(f.seq) from unnest(t1.a, "$") as f;
+-------+
| f.seq |
+-------+
|     0 |
|     1 |
+-------+

注意事项

  • key,index 和 value 全为 null 则代表当前行是 outer 为 true 时默认添加的空行

执行流程

  1. 数据源是 json 列 ...unnest -> project(jsonCol) -> tableScan
  2. 数据源是 json 字符串 ...unnest -> project(default) -> valueScan(parse jsonStr)

实现细节

数据源为 json 字符串
  1. 构建 plan 时将存储在tree.Unnest中的参数序列化后存储到unnestNode.TableDef.TableFunctionParam
  2. unnestNode中添加valueScan节点
  3. valueScan.TableDef.TableFunctionParam赋值为tree.Unnest中存储的 json 字符串转化的字节切片
  4. 在编译阶段首先将valueScanTableDef.TableFunctionParam存入scope.Datasource.Bat
  5. 在 scope 中添加vm.Unnest指令,并通过unnestNode.TableDef.TableFunctionParam构建运行参数
  6. 执行阶段通过bytejson包解析 json 字节切片,解析 path 字符串,通过 json,path, outer,filter参数调用bytejson.Unnest函数,返回UnnestResult结果集
  7. 通过makeBatch组装UnnestResult结果集为batch
数据源为 json 列
  1. 构建 plan 时将存储在tree.Unnest中的参数序列化后存储到unnestNode.TableDef.TableFunctionParam
  2. unnestNode中添加tableScan节点,并根据tree.Unnest中的参数初始化tableScanTableDef
  3. 编译阶段在 scope 中添加vm.Unnest指令,并通过unnestNode.TableDef.TableFunctionParam构建运行参数
  4. 执行阶段通过bytejson包解析由tableScan传递 bytejson 字节切片,解析 path 字符串,通过 json,path, outer,filter参数调用bytejson.Unnest 函数,返回UnnestResult结果集
  5. 通过makeBatch组装UnnestResult结果集为batch

filter参数是根据tree.Unnest中的Attrs字段构建的 string 切片,其目的是为了在bytejson.Unnest函数中过滤不需要的结果集

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Call

func Call(idx int, proc *process.Process, arg any, isFirst bool, isLast bool) (process.ExecStatus, error)

func Prepare

func Prepare(proc *process.Process, arg any) error

func String

func String(arg any, buf *bytes.Buffer)

Types

type Argument

type Argument struct {
	Rets   []*plan.ColDef
	Args   []*plan.Expr
	Attrs  []string
	Params []byte
	Name   string
	// contains filtered or unexported fields
}

func (*Argument) Free

func (arg *Argument) Free(proc *process.Process, pipelineFailed bool, err error)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL