README ¶
csvutil
Command line utility for transforming CSV files.
Design Choices
- Human-readable. Operations and arguments should be easy to remember. There is no shorthand syntax.
- Safe. External programs that can properly escape shell strings should be able to spawn csvutil processes safely.
- Not a database. All operations are done in memory and there are no fancy tricks to speed things up.
Install
Csvutil may be installed directly from the source repository on a system with Go 1.16 or higher, and may be done like so:
$ go install github.com/evantbyrne/csvutil@latest
Note: The $ at the beginning of newlines in this document represents the bash shell prompt, and is not a part of the actual commands.
Usage
$ csvutil source1.csv --operation1 --operation2 source2.csv --operation1 ...
There are two types of arguments that may be passed to csvutil:
- Sources. These are CSV files to read from.
- Operations denoted with the
--
prefix. These transform the current source.
Upon success, csvutil prints the resulting CSV data to STDOUT.
Examples
Here are a few examples of working with csvutil.
Reading files
$ csvutil user.csv
username,id,group_id
foo,1,10
bar,2,
baz,3,20
four,4,20
five,5,
$ csvutil group.csv
name,id
Admin,10
Moderator,20
Filtering
$ csvutil user.csv --where username == baz --or group_id != 20
username,id,group_id
foo,1,10
bar,2,
baz,3,20
five,5,
Join
$ csvutil user.csv group.csv --join group_id == id
username,id,group_id,name,id
foo,1,10,Admin,10
baz,3,20,Moderator,20
four,4,20,Moderator,20
Sort and select
$ csvutil test/user.csv --sort username ALPHA DESC --select id,username
id,username
4,four
1,foo
5,five
3,baz
2,bar
Operations
Concat
Appends one source onto another.
No arguments.
$ csvutil foo.csv
id,name
1,Foo
2,Bar
$ csvutil bar.csv
id,name
3,Baz
$ csvutil foo.csv bar.csv --concat
id,name
1,Foo
2,Bar
3,Baz
Count
Returns the total number of rows for the current source.
Arguments:
- Column name.
$ csvutil user.csv --count total
total
5
Distinct
Removes duplicates from the current source.
Arguments:
- Comma-separated list of columns that must match to be considered duplicate. Use
"*"
to compare all columns.
$ csvutil foo.csv
id,name,group_id
1,Foo,10
2,Bar,10
3,Foo,20
4,Foo,10
5,Foobar,20
1,Baz,20
$ csvutil foo.csv --distinct name
id,name,group_id
1,Foo,10
2,Bar,10
5,Foobar,20
1,Baz,20
$ csvutil foo.csv --distinct name,group_id
id,name,group_id
1,Foo,10
2,Bar,10
3,Foo,20
5,Foobar,20
1,Baz,20
Except
Remove rows from the previous source if they appear in the current source.
Arguments:
- Comma-separated list of columns that must match to be considered duplicate. Use
"*"
to compare all columns.
$ csvutil foo.csv
id,name
1,Foo
2,Bar
3,Baz
4,Foobar
$ csvutil bar.csv
id,name
2,Two
3,Three
5,Five
6,Six
$ csvutil foo.csv bar.csv --except id
id,name
1,Foo
4,Foobar
Join
Perform an inner join on two sources.
Arguments:
- Column from previous source to compare.
- Operation. Choices:
==
,!=
- Column from the current source to compare.
$ csvutil user.csv
username,id,group_id
foo,1,10
bar,2,
baz,3,20
four,4,20
five,5,
$ csvutil group.csv
name,id
Admin,10
Moderator,20
$ csvutil user.csv group.csv --join group_id == id
username,id,group_id,name,id
foo,1,10,Admin,10
baz,3,20,Moderator,20
four,4,20,Moderator,20
Select
Choose a subset of columns from the current source.
Arguments:
- Comma-separated list of columns to select.
$ csvutil foo.csv
id,name,group_id
1,Foo,10
2,Bar,10
3,Baz,20
$ csvutil foo.csv --select name,id
name,id
Foo,1
Bar,2
Baz,3
Sort
Reorder the current source.
Arguments:
- Column to compare.
- Algorithm. Choices:
ALPHA
,FLOAT
,INT
- Order. Choices:
ASC
,DESC
$ csvutil user.csv
username,id,group_id
foo,1,10
bar,2,
baz,3,20
four,4,20
five,5,
$ csvutil user.csv --sort username ALPHA ASC
username,id,group_id
bar,2,
baz,3,20
five,5,
foo,1,10
four,4,20
When using FLOAT
and INT
, values that cannot be converted will be considered lower than values which may be converted, and they will be compared to each other alphabetically. e.g.,
$ csvutil baz.csv
year,cost
10,500
2,200.10
2000,50.01
x,x
2019,1000.99
y,z
z,y
$ csvutil baz.csv --sort year INT DESC
year,cost
2019,1000.99
2000,50.01
10,500
2,200.10
z,y
y,z
x,x
Values
Strip the header from the current source.
$ csvutil user.csv --count total
total
5
$ csvutil user.csv --count total --values
5
Where
Filter the current source.
Arguments:
- Column name to compare.
- Operator. Choices:
==
,!=
,IN
,NOT_IN
- Value to look for.
$ csvutil user.csv
username,id,group_id
foo,1,10
bar,2,
baz,3,20
four,4,20
five,5,
$ csvutil user.csv --where username == baz --or group_id != 20
username,id,group_id
foo,1,10
bar,2,
baz,3,20
five,5,
The IN
and NOT_IN
operators take column-separated lists of values. e.g.,
$ csvutil user.csv --where id IN 1,3
username,id,group_id
foo,1,10
baz,3,20
Operations may be chained to perform "and" conditionals. e.g.,
$ csvutil user.csv --where username != baz --where id != 2
username,id,group_id
foo,1,10
four,4,20
five,5,
Documentation ¶
There is no documentation for this package.