mysql

package
v0.2.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 8, 2024 License: Apache-2.0 Imports: 7 Imported by: 0

README

MysqlReader Plugin Documentation

Quick Introduction

The MysqlReader plugin enables data reading from a MySQL database. Internally, MysqlReader connects to a remote MySQL database using github.com/go-sql-driver/mysql and database/sql, executing corresponding SQL statements to retrieve data from the MySQL server.

Implementation Principles

MysqlReader connects to a remote MySQL database using github.com/go-sql-driver/mysql. Based on user-provided configuration information, it generates SQL queries and sends them to the remote MySQL server. The returned results from these SQL executions are assembled into an abstract dataset using go-etl's custom data types and passed to downstream Writers for processing.

MysqlReader utilizes the query processes defined in dbmsreader and calls go-etl's custom storage/database DBWrapper for specific queries. DBWrapper encapsulates numerous interfaces from database/sql and abstracts out a database dialect. For MySQL, it adopts the dialect implemented in storage/database/mysql.

Functionality Description

Configuration Example

Configuring a job to synchronize data from a MySQL database to a local destination:

{
    "job":{
        "content":[
            {
                "reader":{
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["*"],
                        "connection": {
                            "url": "tcp(192.168.0.1:3306)/mysql?parseTime=false",
                            "table": {
                                "db":"source",
                                "name":"type_table"
                            }
                        },
                        "split": {
                            "key":"id"
                        },
                        "where": "",
                        "querySql":["select a,b from table_a join table_b on table_a.id = table_b.id"]
                    }
                }
            }
        ]
    }
}
Parameter Explanation
url
  • Description: Primarily used to configure the connection information to the remote server. The basic configuration format is: tcp(ip:port)/db, where ip:port represents the IP address and port of the MySQL server, and db indicates the default database to connect to. It is similar to the connection configuration information of mysql, except that the username and password are extracted from the connection configuration for easier encryption.
  • Required: Yes
  • Default: None
username
  • Description: Primarily used to configure the MySQL database username.
  • Required: Yes
  • Default: None
password
  • Description: Primarily used to configure the MySQL database password.
  • Required: Yes
  • Default: None
table

Describes the MySQL table information.

db
  • Description: Primarily used to configure the database name of the MySQL table.
  • Required: Yes
  • Default: None
name
  • Description: Primarily used to configure the table name of the MySQL table.
  • Required: Yes
  • Default: None
column
  • Description: An array of column names from the configured table that need to be synchronized. Users can use the JSON array format to describe the field information. Using "*" represents selecting all columns by default, e.g., ["*"].

    Supports column pruning, meaning users can choose specific columns for export.

    Supports column reordering, allowing columns to be exported in an order different from the table schema.

    Supports constant configuration. Users need to follow the MySQL SQL syntax format: ["id", "table", "1", "'bazhen.csy'", "null", "to_char(a + 1)", "2.3", "true"]. Here, id is a regular column name, table is a column name containing reserved words, 1 is an integer constant, 'bazhen.csy' is a string constant, null is a null pointer, to_char(a + 1) is an expression, 2.3 is a floating-point number, and true is a boolean value.

  • Required: Yes

  • Default: None

split
key
  • Description: Primarily used to configure the split key for the MySQL table. The split key must be of type bigInt/string/time, assuming the data distribution based on the split key is uniform.
  • Required: No
  • Default: None
timeAccuracy
  • Description: Primarily used to configure the time split key for the MySQL table, mainly describing the smallest unit of time, such as day, minute, second, millisecond, microsecond, nanosecond.
  • Required: No
  • Default: None
range
type
  • Description: Primarily used to configure the default value type of the split key for the MySQL table. The value can be bigInt/string/time. This will check the type of the split key in the table, so please ensure the type is correct.
  • Required: No
  • Default: None
left
  • Description: Primarily used to configure the default minimum value of the split key for the MySQL table.
  • Required: No
  • Default: None
right
  • Description: Primarily used to configure the default maximum value of the split key for the MySQL table.
  • Required: No
  • Default: None
where
  • Description: Primarily used to configure the WHERE condition for the SELECT statement.
  • Required: No
  • Default: None
querySql
  • Description: In some business scenarios, the "where" configuration item is not sufficient to describe the filtering conditions. Users can use this configuration item to customize the filtering SQL. When users configure this item, the DataX system will ignore the "table", "column", and other configuration items, and directly use the content of this configuration item for data filtering. For example, if you need to perform a multi-table join and then synchronize the data, you can use select a, b from table_a join table_b on table_a.id = table_b.id. When the user configures querySql, MysqlReader directly ignores the configuration of table, column, and where conditions. The priority of querySql is higher than that of table, column, and where options.
  • Required: No
  • Default: None
trimChar
  • Description: Specifies whether to remove leading and trailing spaces for char types in MySQL.
  • Required: No
  • Default: false
Type Conversion

Currently, MysqlReader supports most MySQL types, but there are still some individual types that are not supported. Please check your types carefully.

Below is a conversion table for MysqlReader regarding MySQL types:

go-etl Type MySQL Data Type
bigInt int, tinyint, smallint, mediumint, bigint, year,unsigned int, unsigned bigint, unsigned smallint, unsigned tinyint
decimal float, double, decimal
string varchar, char, tinytext, text, mediumtext, longtext
time date, datetime, timestamp, time
bytes tinyblob, mediumblob, blob, longblob, varbinary, bit

Performance Report

To be tested.

Constraints and Limitations

Database Encoding Issues

Currently, only the utf8 character set is supported.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Job

type Job struct {
	*dbms.Job
}

Job

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader - A component or tool used for reading data from a source

func (*Reader) Job

func (r *Reader) Job() spireader.Job

Job - A unit of work or task to be performed

func (*Reader) ResourcesConfig

func (r *Reader) ResourcesConfig() *config.JSON

ResourcesConfig - Configuration for the resources used by a plugin

func (*Reader) Task

func (r *Reader) Task() spireader.Task

Task - A specific piece of work or operation within a larger context, often part of a Job

type Task

type Task struct {
	*dbms.Task
}

Task

func (*Task) StartRead

func (t *Task) StartRead(ctx context.Context, sender plugin.RecordSender) (err error)

StartRead

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL