maxcompute

package
v0.11.1 Latest
Published: Dec 17, 2024 License: Apache-2.0 Imports: 22 Imported by: 0

README

maxcompute

Usage

The maxcompute extractor allows you to extract metadata from MaxCompute tables and schemas. It supports configuration for project name, endpoint, access keys, schema name, exclusions, and concurrency.

source:
    name: maxcompute
    config:
        project_name: goto_test
        endpoint_project: http://goto_test-maxcompute.com
        access_key:
            id: access_key_id
            secret: access_key_secret
        schema_name: DEFAULT
        exclude:
            schemas:
                - schema_a
                - schema_b
            tables:
                - schema_c.table_a
        concurrency: 10
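
The config above can be pictured as a plain Go struct with a check for the required keys. This is only a sketch: `Config`, `AccessKey`, `Exclude`, and `Validate` are hypothetical names introduced here, and the package's real config type is not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// AccessKey mirrors the access_key block.
type AccessKey struct {
	ID     string // access_key.id
	Secret string // access_key.secret
}

// Exclude mirrors the exclude block.
type Exclude struct {
	Schemas []string // exclude.schemas
	Tables  []string // exclude.tables, written as schema.table
}

// Config is a hypothetical stand-in for the extractor's config type,
// which this page does not show.
type Config struct {
	ProjectName     string
	EndpointProject string
	AccessKey       AccessKey
	SchemaName      string
	Exclude         Exclude
	Concurrency     int
}

// Validate returns an error naming the first required key that is empty.
func (c Config) Validate() error {
	required := []struct{ key, value string }{
		{"project_name", c.ProjectName},
		{"endpoint_project", c.EndpointProject},
		{"access_key.id", c.AccessKey.ID},
		{"access_key.secret", c.AccessKey.Secret},
	}
	for _, r := range required {
		if r.value == "" {
			return errors.New(r.key + " is required")
		}
	}
	return nil
}

func main() {
	cfg := Config{
		ProjectName:     "goto_test",
		EndpointProject: "http://goto_test-maxcompute.com",
		AccessKey:       AccessKey{ID: "access_key_id", Secret: "access_key_secret"},
		SchemaName:      "DEFAULT",
		Exclude: Exclude{
			Schemas: []string{"schema_a", "schema_b"},
			Tables:  []string{"schema_c.table_a"},
		},
		Concurrency: 10,
	}
	fmt.Println("valid:", cfg.Validate() == nil)
}
```

Only the four keys marked required in the Inputs table are checked; the optional keys are left as given because their defaults are not documented here.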

Inputs

| Key | Value | Example | Description | Required |
| :-- | :---- | :------ | :---------- | :------- |
| project_name | string | goto_test | MaxCompute Project Name | required |
| endpoint_project | string | http://goto_test-maxcompute.com | Endpoint Project URL | required |
| access_key.id | string | access_key_id | Access Key ID | required |
| access_key.secret | string | access_key_secret | Access Key Secret | required |
| schema_name | string | DEFAULT | Default schema name | optional |
| exclude.schemas | []string | ["schema_a", "schema_b"] | List of schemas to exclude | optional |
| exclude.tables | []string | ["schema_c.table_a"] | List of tables to exclude | optional |
| concurrency | int | 10 | Number of concurrent requests to MaxCompute | optional |

Notes
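
The `exclude.schemas` and `exclude.tables` inputs can be read as two lookup sets. The sketch below assumes that table entries are matched exactly and case-sensitively in the `schema.table` form used in the example, and that excluding a schema also skips every table in it. The type and function names are hypothetical and are not taken from the package.

```go
package main

import "fmt"

// exclusions holds the excluded schemas and tables as sets for O(1) lookup.
type exclusions struct {
	schemas map[string]struct{}
	tables  map[string]struct{}
}

// newExclusions builds the sets from the two config lists.
func newExclusions(schemas, tables []string) exclusions {
	e := exclusions{
		schemas: make(map[string]struct{}, len(schemas)),
		tables:  make(map[string]struct{}, len(tables)),
	}
	for _, s := range schemas {
		e.schemas[s] = struct{}{}
	}
	for _, t := range tables {
		e.tables[t] = struct{}{}
	}
	return e
}

// skipSchema reports whether the whole schema is excluded.
func (e exclusions) skipSchema(schema string) bool {
	_, ok := e.schemas[schema]
	return ok
}

// skipTable reports whether a table is excluded, either through its
// schema or through its own schema.table entry.
func (e exclusions) skipTable(schema, table string) bool {
	if e.skipSchema(schema) {
		return true
	}
	_, ok := e.tables[schema+"."+table]
	return ok
}

func main() {
	ex := newExclusions(
		[]string{"schema_a", "schema_b"},
		[]string{"schema_c.table_a"},
	)
	fmt.Println(ex.skipTable("schema_a", "any_table")) // true
	fmt.Println(ex.skipTable("schema_c", "table_a"))   // true
	fmt.Println(ex.skipTable("schema_c", "table_b"))   // false
}
```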

Outputs

| Field | Sample Value | Description |
| :---- | :----------- | :---------- |
| resource.urn | project_name.schema_name.table_name | |
| resource.name | table_name | |
| resource.service | maxcompute | |
| description | table description | |
| schema | []Column | |
| properties.partition_data | "partition_data": {"partition_field": "data_date", "require_partition_filter": false, "time_partition": {"partition_by": "DAY", "partition_expire": 0}} | partition-related data for time and range partitioning |
| properties.partition_field | created_at | field on which the table is time partitioned |

Partition Data

| Field | Sample Value | Description |
| :---- | :----------- | :---------- |
| partition_field | created_at | field on which the table is partitioned, by either TimePartitioning or RangePartitioning. If the field is empty for TimePartitioning, _PARTITIONTIME is returned instead. |
| require_partition_filter | true | boolean that denotes whether every query on the MaxCompute table must include at least one predicate that references only the partitioning column |
| time_partition.partition_by | HOUR | partition type: HOUR, DAY, MONTH, or YEAR |
| time_partition.partition_expire_seconds | 0 | time in seconds after which data in this partition expires; 0 means it never expires |
| range_partition.interval | 10 | width of an interval range |
| range_partition.start | 0 | start value for the partition, inclusive |
| range_partition.end | 100 | end value for the partition, exclusive |
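
The three `range_partition` fields describe a half-open range `[start, end)` cut into steps of `interval`. The sketch below shows that arithmetic for the sample values (start 0, end 100, interval 10). It is a plain reading of the field descriptions, not code from the extractor, and `rangeBucket` is a hypothetical helper; how MaxCompute itself assigns rows to partitions is not stated here.

```go
package main

import "fmt"

// rangeBucket returns the zero-based index of the interval that contains v.
// The second result is false when v lies outside [start, end) or when
// interval is not positive.
func rangeBucket(v, start, end, interval int64) (int64, bool) {
	if interval <= 0 || v < start || v >= end {
		return 0, false
	}
	return (v - start) / interval, true
}

func main() {
	idx, ok := rangeBucket(42, 0, 100, 10)
	fmt.Println(idx, ok) // 4 true

	_, ok = rangeBucket(100, 0, 100, 10)
	fmt.Println(ok) // false, because end is exclusive
}
```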

Column

| Field | Sample Value |
| :---- | :----------- |
| name | total_price |
| description | item's total price |
| data_type | decimal |
| is_nullable | true |

Join

| Field | Sample Value |
| :---- | :----------- |
| urn | project_name.schema_name.table_name |
| count | 3 |
| conditions | ["ON target.column_1 = source.column_1 and target.param_name = source.param_name","ON DATE(target.event_timestamp) = DATE(source.event_timestamp)"] |

Contributing

Refer to the contribution guidelines for information on contributing to this module.

Documentation

Types

type Client

type Client interface {
	ListSchema(ctx context.Context) ([]*odps.Schema, error)
	ListTable(ctx context.Context, schemaName string) ([]*odps.Table, error)
	GetTableSchema(ctx context.Context, table *odps.Table) (string, *tableschema.TableSchema, error)
	GetTablePreview(ctx context.Context, partitionValue string, table *odps.Table, maxRows int) ([]string, *structpb.ListValue, error)
}

func CreateClient

func CreateClient(_ context.Context, _ log.Logger, conf config.Config) (Client, error)

type Extractor

type Extractor struct {
	plugins.BaseExtractor
	// contains filtered or unexported fields
}

func New

func New(logger log.Logger, clientFunc NewClientFunc, randFn randFn) *Extractor

func (*Extractor) Extract

func (e *Extractor) Extract(ctx context.Context, emit plugins.Emit) error

func (*Extractor) Init

func (e *Extractor) Init(ctx context.Context, conf plugins.Config) error

type NewClientFunc

type NewClientFunc func(ctx context.Context, logger log.Logger, conf config.Config) (Client, error)
