Extractor
A simple utility to list or extract columns from a CSV file, or to split a CSV file into several files partitioned by a column.
Usage and Examples
There are mainly two commands as of now, list and extract. A third command, partition, is also described in this README.
Usage: extractor <command>
Flags:
-h, --help Show context-sensitive help.
-v, --verbose Enable debug mode
--sep="comma" Separator to be used.
Commands:
list List CSV Columns
extract Extracts columns from CSV
Run "extractor <command> --help" for more information on a command.
List
This command lists the columns available in the input file.
Usage: extractor list <input>
List CSV Columns
Arguments:
<input> Input filename
Flags:
-h, --help Show context-sensitive help.
-v, --verbose Enable debug mode
--sep="comma" Separator to be used.
Example:
$❯ extractor list eng.4.csv
* Round
* Date
* Team 1
* FT
* Team 2
Extract
This command extracts the given columns from the input file and writes them to the output file. In debug mode (-v), --count specifies the update frequency for progress output.
Usage: extractor extract <input> <output> <columns> ...
Extracts columns from CSV
Arguments:
<input> Input filename
<output> Output filename
<columns> ... Columns to extract
Flags:
-h, --help Show context-sensitive help.
-v, --verbose Enable debug mode
--sep="comma" Separator to be used.
--count=1000 Frequency to show progress
Example:
$❯ extractor extract eng.4.csv /dev/stdout "Team 1" "Team 2"
Team 1,Team 2
Barrow AFC,Stevenage FC
Bolton Wanderers FC,Forest Green Rovers FC
Bradford City AFC,Colchester United FC
Cambridge United FC,Carlisle United FC
Cheltenham Town FC,Morecambe FC
Walsall FC,Grimsby Town FC
Mansfield Town FC,Tranmere Rovers FC
Oldham Athletic AFC,Leyton Orient FC
Port Vale FC,Crawley Town FC
...
$❯ extractor extract eng.4.csv eng.4.teams.csv "Team 1" "Team 2" -v --count 10
Opening input file... eng.4.csv
Opening output file... eng.4.teams.csv
Starting extraction...
Extracted 553 records.
Finished.
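The extraction step can be sketched in Go with the standard encoding/csv package. This is a minimal illustration, not the tool's actual source: the function names columnIndexes, project and extract are hypothetical, and the sample input uses "-" as a placeholder for the columns whose values are not shown in this README.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"os"
	"strings"
)

// columnIndexes maps each wanted column name to its position in the header.
// It returns an error if a name is not present in the header.
func columnIndexes(header, wanted []string) ([]int, error) {
	pos := make(map[string]int, len(header))
	for i, name := range header {
		pos[name] = i
	}
	idx := make([]int, 0, len(wanted))
	for _, name := range wanted {
		i, ok := pos[name]
		if !ok {
			return nil, fmt.Errorf("column %q not found", name)
		}
		idx = append(idx, i)
	}
	return idx, nil
}

// project returns the fields of record at the given positions, in order.
func project(record []string, idx []int) []string {
	out := make([]string, len(idx))
	for i, j := range idx {
		out[i] = record[j]
	}
	return out
}

// extract copies the wanted columns from r to w and returns the number
// of data records written (the header row is not counted).
func extract(r io.Reader, w io.Writer, sep rune, wanted []string) (int, error) {
	cr := csv.NewReader(r)
	cr.Comma = sep
	cw := csv.NewWriter(w)
	cw.Comma = sep

	header, err := cr.Read()
	if err != nil {
		return 0, fmt.Errorf("reading header: %w", err)
	}
	idx, err := columnIndexes(header, wanted)
	if err != nil {
		return 0, err
	}
	if err := cw.Write(wanted); err != nil {
		return 0, err
	}
	n := 0
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return n, err
		}
		if err := cw.Write(project(rec, idx)); err != nil {
			return n, err
		}
		n++
	}
	cw.Flush()
	return n, cw.Error()
}

func main() {
	// Header taken from the list example; "-" marks placeholder values.
	in := "Round,Date,Team 1,FT,Team 2\n-,-,Barrow AFC,-,Stevenage FC\n"
	n, err := extract(strings.NewReader(in), os.Stdout, ',', []string{"Team 1", "Team 2"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Fprintf(os.Stderr, "Extracted %d records.\n", n)
}
```

Resolving the column names to indexes once, before the read loop, keeps the per-record work to a slice lookup. Reading and writing one record at a time also means the whole file never has to fit in memory.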
Partition
This command splits the input into several files, partitioned by the values of the given column. It supports verbose mode, which prints more details about the tasks it is running.
To leave the partition column out of the output files, use --drop.
By default, the prefix and suffix of the output filenames are derived from the input filename by splitting it at the last "."; each output file is named by putting "-" and the partition value in between (for example, timezone.csv yields timezone-Africa.csv).
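The naming rule above can be sketched as a pair of small Go functions. The names splitName and partitionName are hypothetical. The handling of a filename without a "." and the choice to keep the dot in the suffix are assumptions, not confirmed behavior of the tool.

```go
package main

import (
	"fmt"
	"strings"
)

// splitName splits a filename at its last "." into a prefix and a suffix.
// The suffix keeps the dot. Assumption: a name without a dot gets an
// empty suffix.
func splitName(filename string) (prefix, suffix string) {
	i := strings.LastIndex(filename, ".")
	if i < 0 {
		return filename, ""
	}
	return filename[:i], filename[i:]
}

// partitionName builds the output filename for one partition value by
// placing "-" and the value between the prefix and the suffix.
func partitionName(prefix, suffix, value string) string {
	return prefix + "-" + value + suffix
}

func main() {
	prefix, suffix := splitName("timezone.csv")
	for _, value := range []string{"Africa", "Pacific", "UTC"} {
		fmt.Println(partitionName(prefix, suffix, value))
	}
}
```

For timezone.csv this yields names such as timezone-Africa.csv, matching the listing in the example. Note that this sketch splits at the last "." anywhere in the string, so a path whose directory contains a dot but whose file has no extension would be split in the wrong place.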
Usage: extractor partition <input> <column>
Partitions a file based on the given column from CSV
Arguments:
<input> Input filename
<column> Column to use to split by
Flags:
-h, --help Show context-sensitive help.
-v, --verbose Enable debug mode
--sep="comma" Separator to be used.
--prefix=STRING Prefix of the output file
--suffix=STRING Suffix of the output file
--drop Drop Column in output file(s)
Example:
$❯ head -5 timezone.csv # Check the file
Value,Label,Group
Africa/Abidjan,Abidjan,Africa
Africa/Accra,Accra,Africa
Africa/Addis_Ababa,Addis Ababa,Africa
Africa/Algiers,Algiers,Africa
$❯ extractor partition timezone.csv Group # Run partition command
$❯ ls -1 timezone-*.csv
timezone-Africa.csv
timezone-America.csv
timezone-Antarctica.csv
...
timezone-Pacific.csv
timezone-UTC.csv
$❯ head -5 timezone-Pacific.csv
Value,Label,Group
Pacific/Apia,Apia,Pacific
Pacific/Auckland,Auckland,Pacific
Pacific/Bougainville,Bougainville,Pacific
Pacific/Chatham,Chatham,Pacific
Example:
$❯ # Run partition command and drop the partitioned column
$❯ extractor partition --drop timezone.csv Group
$❯ head -5 timezone-Pacific.csv
Value,Label
Pacific/Apia,Apia
Pacific/Auckland,Auckland
Pacific/Bougainville,Bougainville
Pacific/Chatham,Chatham
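The partition step, with and without --drop, can be sketched in Go as follows. This is an illustration under stated assumptions rather than the tool's implementation: the function partition is hypothetical, it groups all rows in memory instead of streaming to files, and it prints each group to standard output instead of creating the output files. The sample rows are taken from the timezone.csv examples above.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"sort"
	"strings"
)

// partition groups data rows by the value of the named column.
// records[0] is the header. Each group starts with the header row.
// When drop is true, the partition column is removed from the header
// and from every row. Rows are assumed to have the same number of
// fields as the header, which csv.Reader enforces by default.
func partition(records [][]string, column string, drop bool) (map[string][][]string, error) {
	if len(records) == 0 {
		return nil, fmt.Errorf("no header row")
	}
	header := records[0]
	col := -1
	for i, name := range header {
		if name == column {
			col = i
			break
		}
	}
	if col < 0 {
		return nil, fmt.Errorf("column %q not found", column)
	}
	// keep copies a row, skipping the partition column if drop is set.
	keep := func(row []string) []string {
		out := make([]string, 0, len(row))
		for i, field := range row {
			if drop && i == col {
				continue
			}
			out = append(out, field)
		}
		return out
	}
	groups := make(map[string][][]string)
	for _, row := range records[1:] {
		key := row[col]
		if _, ok := groups[key]; !ok {
			groups[key] = [][]string{keep(header)}
		}
		groups[key] = append(groups[key], keep(row))
	}
	return groups, nil
}

func main() {
	in := "Value,Label,Group\n" +
		"Africa/Abidjan,Abidjan,Africa\n" +
		"Africa/Accra,Accra,Africa\n" +
		"Pacific/Apia,Apia,Pacific\n"
	records, err := csv.NewReader(strings.NewReader(in)).ReadAll()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	groups, err := partition(records, "Group", true)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Sort the keys so the output order is stable.
	keys := make([]string, 0, len(groups))
	for k := range groups {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("timezone-%s.csv\n", k)
		if err := csv.NewWriter(os.Stdout).WriteAll(groups[k]); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}
```

Grouping in memory keeps the sketch short. For large inputs, a streaming variant that keeps one open writer per partition value would avoid holding every row at once.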