MysqlReader Plugin Documentation
Quick Introduction
The MysqlReader plugin enables data reading from a MySQL database. Internally, MysqlReader connects to a remote MySQL database using github.com/go-sql-driver/mysql
and database/sql
, executing corresponding SQL statements to retrieve data from the MySQL server.
Implementation Principles
MysqlReader connects to a remote MySQL database using github.com/go-sql-driver/mysql
. Based on user-provided configuration information, it generates SQL queries and sends them to the remote MySQL server. The returned results from these SQL executions are assembled into an abstract dataset using go-etl's custom data types and passed to downstream Writers for processing.
MysqlReader utilizes the query processes defined in dbmsreader
and calls go-etl's custom storage/database
DBWrapper for specific queries. DBWrapper encapsulates numerous interfaces from database/sql
and abstracts out a database dialect. For MySQL, it adopts the dialect implemented in storage/database/mysql
.
Functionality Description
Configuration Example
Configuring a job to synchronize data from a MySQL database to a local destination:
{
"job":{
"content":[
{
"reader":{
"name": "mysqlreader",
"parameter": {
"username": "root",
"password": "123456",
"column": ["*"],
"connection": {
"url": "tcp(192.168.0.1:3306)/mysql?parseTime=false",
"table": {
"db":"source",
"name":"type_table"
}
},
"split": {
"key":"id"
},
"where": "",
"querySql":["select a,b from table_a join table_b on table_a.id = table_b.id"]
}
}
}
]
}
}
Parameter Explanation
url
- Description: Primarily used to configure the connection information to the remote server. The basic configuration format is:
tcp(ip:port)/db
, where ip:port
represents the IP address and port of the MySQL server, and db
indicates the default database to connect to. It is similar to the connection configuration information of mysql, except that the username and password are extracted from the connection configuration for easier encryption.
- Required: Yes
- Default: None
username
- Description: Primarily used to configure the MySQL database username.
- Required: Yes
- Default: None
password
- Description: Primarily used to configure the MySQL database password.
- Required: Yes
- Default: None
table
Describes the MySQL table information.
db
- Description: Primarily used to configure the database name of the MySQL table.
- Required: Yes
- Default: None
name
- Description: Primarily used to configure the table name of the MySQL table.
- Required: Yes
- Default: None
column
-
Description: An array of column names from the configured table that need to be synchronized. Users can use the JSON array format to describe the field information. Using "*" represents selecting all columns by default, e.g., ["*"]
.
Supports column pruning, meaning users can choose specific columns for export.
Supports column reordering, allowing columns to be exported in an order different from the table schema.
Supports constant configuration. Users need to follow the MySQL SQL syntax format: ["id", "
table", "1", "'bazhen.csy'", "null", "to_char(a + 1)", "2.3", "true"]
. Here, id
is a regular column name, table
is a column name containing reserved words, 1
is an integer constant, 'bazhen.csy'
is a string constant, null
is a null pointer, to_char(a + 1)
is an expression, 2.3
is a floating-point number, and true
is a boolean value.
-
Required: Yes
-
Default: None
split
key
- Description: Primarily used to configure the split key for the MySQL table. The split key must be of type bigInt/string/time, assuming the data distribution based on the split key is uniform.
- Required: No
- Default: None
timeAccuracy
- Description: Primarily used to configure the time split key for the MySQL table, mainly describing the smallest unit of time, such as day, minute, second, millisecond, microsecond, nanosecond.
- Required: No
- Default: None
range
type
- Description: Primarily used to configure the default value type of the split key for the MySQL table. The value can be bigInt/string/time. This will check the type of the split key in the table, so please ensure the type is correct.
- Required: No
- Default: None
left
- Description: Primarily used to configure the default maximum value of the split key for the MySQL table.
- Required: No
- Default: None
right
- Description: Primarily used to configure the default minimum value of the split key for the MySQL table.
- Required: No
- Default: None
where
- Description: Primarily used to configure the WHERE condition for the SELECT statement.
- Required: No
- Default: None
querySql
- Description: In some business scenarios, the "where" configuration item is not sufficient to describe the filtering conditions. Users can use this configuration item to customize the filtering SQL. When users configure this item, the DataX system will ignore the "table", "column", and other configuration items, and directly use the content of this configuration item for data filtering. For example, if you need to perform a multi-table join and then synchronize the data, you can use
select a, b from table_a join table_b on table_a.id = table_b.id
.
When the user configures querySql
, MysqlReader directly ignores the configuration of table
, column
, and where
conditions. The priority of querySql
is higher than that of table
, column
, and where
options.
- Required: No
- Default: None
trimChar
- Description: Specifies whether to remove leading and trailing spaces for char types in MySQL.
- Required: No
- Default: false
Type Conversion
Currently, MysqlReader supports most MySQL types, but there are still some individual types that are not supported. Please check your types carefully.
Below is a conversion table for MysqlReader regarding MySQL types:
go-etl Type |
MySQL Data Type |
bigInt |
int, tinyint, smallint, mediumint, bigint, year,unsigned int, unsigned bigint, unsigned smallint, unsigned tinyint |
decimal |
float, double, decimal |
string |
varchar, char, tinytext, text, mediumtext, longtext |
time |
date, datetime, timestamp, time |
bytes |
tinyblob, mediumblob, blob, longblob, varbinary, bit |
To be tested.
Constraints and Limitations
Database Encoding Issues
Currently, only the utf8 character set is supported.