Documentation ¶
Overview ¶
The datatools package is a collection of Go-based command line tools for working with JSON content.
@Author R. S. Doiel, <rsdoiel@caltech.edu>
Copyright (c) 2021, Caltech
All rights not granted herein are expressly reserved by Caltech.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
datatools.go is a package for working with various types of data (e.g. CSV, XLSX, JSON) in support of the utilities included in the datatools.go package.
Index ¶
- Constants
- func ApplyStopWords(fields []string, stopWords []string) []string
- func CSVMarshal(fields []string) ([]byte, error)
- func CSVRandomRows(in io.Reader, out io.Writer, showHeader bool, rowCount int, delimiter string, ...) error
- func CSVRows(in io.Reader, out io.Writer, showHeader bool, rowNos []int, delimiter string, ...) error
- func CSVRowsAll(in io.Reader, out io.Writer, showHeader bool, delimiter string, ...) error
- func CodemetaToCitationCff(srcName, destName string) error
- func EnglishTitle(s string) string
- func Filter(c rune, allowableCharacters string, allowPunctuation bool) bool
- func FmtHelp(src string, appName string, version string, releaseDate string, ...) string
- func JSONMarshal(data interface{}) ([]byte, error)
- func JSONMarshalIndent(data interface{}, prefix string, indent string) ([]byte, error)
- func JSONObjectsToCSV(in io.Reader, out io.Writer, eout io.Writer, quiet bool, showHeader bool, ...) error
- func JSONUnmarshal(src []byte, data interface{}) error
- func Levenshtein(src string, target string, insertCost int, deleteCost int, substituteCost int, ...) int
- func NormalizeDelimiter(s string) string
- func NormalizeDelimiterRune(s string) rune
- func ParseRange(s string) ([]int, error)
- func Text2Fields(r *bufio.Reader, options *Options) ([]byte, error)
- type Options
- type SQLCfg
- type SQLStore
Constants ¶
const (
	// Constants for datatools functions
	AsDelimited = iota
	AsCSV       = iota
	AsJSON      = iota
)
const (
	// Version number of release
	Version = "1.2.12"

	// ReleaseDate, the date version.go was generated
	ReleaseDate = "2024-11-07"

	// ReleaseHash, the Git hash when version.go was generated
	ReleaseHash = "1128bff"

	LicenseText = `` /* 1524-byte string literal not displayed */
)
Variables ¶
This section is empty.
Functions ¶
func ApplyStopWords ¶ added in v0.0.7
ApplyStopWords takes a list of words (a slice of strings), removes any occurrences of the stop words, and returns the revised list of words.
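As a rough illustration of the idea, the sketch below filters a word list against a stop word set. The function name `removeStopWords` is hypothetical, and the case-insensitive comparison is an assumption; the source does not say how ApplyStopWords compares words.

```go
package main

import (
	"fmt"
	"strings"
)

// removeStopWords is a hypothetical stand-in for ApplyStopWords.
// Assumption: words are compared case-insensitively.
func removeStopWords(fields []string, stopWords []string) []string {
	// Build a set so each lookup is constant time.
	stop := make(map[string]struct{}, len(stopWords))
	for _, w := range stopWords {
		stop[strings.ToLower(w)] = struct{}{}
	}
	kept := make([]string, 0, len(fields))
	for _, f := range fields {
		if _, found := stop[strings.ToLower(f)]; !found {
			kept = append(kept, f)
		}
	}
	return kept
}

func main() {
	words := []string{"The", "quick", "fox", "and", "the", "hound"}
	fmt.Println(removeStopWords(words, []string{"the", "and"})) // [quick fox hound]
}
```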
func CSVMarshal ¶ added in v0.0.7
CSVMarshal takes a list of strings and returns a byte array of CSV formatted output.
func CSVRandomRows ¶ added in v0.0.24
func CSVRandomRows(in io.Reader, out io.Writer, showHeader bool, rowCount int, delimiter string, lazyQuotes, trimLeadingSpace bool) error
CSVRandomRows reads from in, creates a CSV reader and writer, and randomly selects rowCount rows to write to out. If showHeader is true, the header row is excluded from the random selection and written to out before the randomized rows. rowCount is the number of rows to return, not counting the header row.
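The behavior described above could be approximated with the standard `encoding/csv` and `math/rand` packages. This is only a sketch under stated assumptions: `sampleRows` and `sampleString` are hypothetical names, the whole input is read into memory, and the delimiter, lazyQuotes and trimLeadingSpace options are left out.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"math/rand"
	"strings"
)

// sampleRows is a hypothetical stand-in for CSVRandomRows.
// Assumption: when showHeader is false the first row is treated as data.
func sampleRows(in io.Reader, out io.Writer, showHeader bool, rowCount int, rng *rand.Rand) error {
	rows, err := csv.NewReader(in).ReadAll()
	if err != nil {
		return err
	}
	w := csv.NewWriter(out)
	if showHeader && len(rows) > 0 {
		// Write the header first and remove it from the sampling pool.
		if err := w.Write(rows[0]); err != nil {
			return err
		}
		rows = rows[1:]
	}
	rng.Shuffle(len(rows), func(i, j int) { rows[i], rows[j] = rows[j], rows[i] })
	if rowCount < 0 {
		rowCount = 0
	}
	if rowCount > len(rows) {
		rowCount = len(rows)
	}
	for _, row := range rows[:rowCount] {
		if err := w.Write(row); err != nil {
			return err
		}
	}
	w.Flush()
	return w.Error()
}

// sampleString wraps sampleRows for string input and output.
func sampleString(src string, showHeader bool, rowCount int, seed int64) (string, error) {
	var out strings.Builder
	rng := rand.New(rand.NewSource(seed))
	err := sampleRows(strings.NewReader(src), &out, showHeader, rowCount, rng)
	return out.String(), err
}

func main() {
	src := "name,score\nada,90\nbob,75\ncy,82\ndee,68\n"
	out, err := sampleString(src, true, 2, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out) // the header line followed by two randomly chosen rows
}
```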
func CSVRows ¶ added in v0.0.24
func CSVRows(in io.Reader, out io.Writer, showHeader bool, rowNos []int, delimiter string, lazyQuotes, trimLeadingSpace bool) error
CSVRows renders the rows whose numbers are listed in rowNos to out, using the delimiter.
func CSVRowsAll ¶ added in v0.0.24
func CSVRowsAll(in io.Reader, out io.Writer, showHeader bool, delimiter string, lazyQuotes bool, trimLeadingSpace bool) error
CSVRowsAll renders all rows to out, using the delimiter.
func CodemetaToCitationCff ¶ added in v1.0.3
CodemetaToCitationCff converts a codemeta.json file to CITATION.cff format.
func EnglishTitle ¶ added in v0.0.18
EnglishTitle uses improved capitalization rules for English titles. This is based on the approach suggested in the Go language Cookbook:
http://golangcookbook.com/chapters/strings/title/
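A minimal sketch of this kind of title casing is shown below. It is not the package's implementation: the name `englishTitle` and the list of small words are assumptions, and the sketch lowercases the input first, which would flatten acronyms.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// englishTitle is a hypothetical sketch of English title capitalization.
// Assumption: the small-word list below; the real rules may differ.
func englishTitle(s string) string {
	small := map[string]bool{
		"a": true, "an": true, "and": true, "of": true,
		"on": true, "the": true, "to": true,
	}
	words := strings.Fields(strings.ToLower(s))
	for i, w := range words {
		// Small words stay lowercase unless they open the title.
		if i > 0 && small[w] {
			continue
		}
		r := []rune(w)
		r[0] = unicode.ToUpper(r[0])
		words[i] = string(r)
	}
	return strings.Join(words, " ")
}

func main() {
	fmt.Println(englishTitle("the lord of the rings")) // The Lord of the Rings
}
```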
func Filter ¶ added in v0.0.7
Filter filters out characters from a string. By default it allows letters and numbers through, with options to allow punctuation and other specific characters. It returns true if the character matches the filter, false otherwise.
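The per-character test could look something like the following sketch, which uses the standard `unicode` package. The names `allowed` and `keepAllowed` are hypothetical, and the exact character classes the package uses are an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// allowed is a hypothetical stand-in for Filter. It reports whether c
// should be kept: letters and numbers always pass, punctuation passes only
// when allowPunctuation is set, and anything in allowableCharacters passes.
func allowed(c rune, allowableCharacters string, allowPunctuation bool) bool {
	if unicode.IsLetter(c) || unicode.IsNumber(c) {
		return true
	}
	if allowPunctuation && unicode.IsPunct(c) {
		return true
	}
	return strings.ContainsRune(allowableCharacters, c)
}

// keepAllowed applies the test to every rune, dropping those that fail.
func keepAllowed(s, allowableCharacters string, allowPunctuation bool) string {
	return strings.Map(func(c rune) rune {
		if allowed(c, allowableCharacters, allowPunctuation) {
			return c
		}
		return -1 // a negative rune tells strings.Map to drop the character
	}, s)
}

func main() {
	fmt.Println(keepAllowed("Hello, World! 42", " ", false)) // Hello World 42
	fmt.Println(keepAllowed("Hello, World! 42", " ", true))  // Hello, World! 42
}
```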
func FmtHelp ¶ added in v1.2.5
func FmtHelp(src string, appName string, version string, releaseDate string, releaseHash string) string
FmtHelp lets you process a text block with simple curly brace markup.
func JSONMarshal ¶ added in v1.2.4
JSONMarshal provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal() and json.MarshalIndent().
func JSONMarshalIndent ¶ added in v1.2.4
JSONMarshalIndent provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal() and json.MarshalIndent().
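One way to get this effect with the standard library is to use a `json.Encoder` with HTML escaping turned off. The sketch assumes the issue in question is the default escaping of `<`, `>` and `&` as `\u003c`, `\u003e` and `\u0026`; `marshalNoHTMLEscape` is a hypothetical name and may not match how JSONMarshal is written.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// marshalNoHTMLEscape is a hypothetical sketch, not the package's code.
// It encodes data without escaping <, > and &.
func marshalNoHTMLEscape(data interface{}) ([]byte, error) {
	buf := &bytes.Buffer{}
	enc := json.NewEncoder(buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(data); err != nil {
		return nil, err
	}
	// Encode appends a newline; trim it to match json.Marshal output.
	return bytes.TrimRight(buf.Bytes(), "\n"), nil
}

func main() {
	data := map[string]string{"title": "R&D <draft>"}
	std, _ := json.Marshal(data)
	custom, _ := marshalNoHTMLEscape(data)
	fmt.Println(string(std))    // {"title":"R\u0026D \u003cdraft\u003e"}
	fmt.Println(string(custom)) // {"title":"R&D <draft>"}
}
```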
func JSONObjectsToCSV ¶ added in v1.2.9
func JSONObjectsToCSV(in io.Reader, out io.Writer, eout io.Writer, quiet bool, showHeader bool, delimiter string) error
JSONObjectsToCSV takes a JSON array of objects and maps it to CSV columns and rows. This works a little like Python's csv.DictWriter. In Go a `map[string]interface{}` is used to represent each object. If a value is complex, it is rendered as YAML into the cell.
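The mapping from objects to rows can be sketched with the standard library alone. Several things here are assumptions rather than documented behavior: the name `objectsToCSV`, the alphabetical column order, and the empty cell for a missing key. The sketch also renders complex values as JSON rather than YAML to avoid a third-party dependency.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// objectsToCSV is a hypothetical sketch of a DictWriter-style conversion.
func objectsToCSV(src string) (string, error) {
	var objects []map[string]interface{}
	if err := json.Unmarshal([]byte(src), &objects); err != nil {
		return "", err
	}
	// Collect every key that appears in any object, then sort for a stable header.
	seen := map[string]bool{}
	var columns []string
	for _, obj := range objects {
		for key := range obj {
			if !seen[key] {
				seen[key] = true
				columns = append(columns, key)
			}
		}
	}
	sort.Strings(columns)

	var buf strings.Builder
	w := csv.NewWriter(&buf)
	if err := w.Write(columns); err != nil {
		return "", err
	}
	for _, obj := range objects {
		row := make([]string, len(columns))
		for i, col := range columns {
			switch v := obj[col].(type) {
			case nil:
				row[i] = "" // missing key or JSON null
			case string:
				row[i] = v
			case map[string]interface{}, []interface{}:
				// Complex value: JSON here, where the package uses YAML.
				b, err := json.Marshal(v)
				if err != nil {
					return "", err
				}
				row[i] = string(b)
			default:
				row[i] = fmt.Sprint(v)
			}
		}
		if err := w.Write(row); err != nil {
			return "", err
		}
	}
	w.Flush()
	return buf.String(), w.Error()
}

func main() {
	out, err := objectsToCSV(`[{"name":"Ada","age":36},{"name":"Bob","tags":["x","y"]}]`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```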
func JSONUnmarshal ¶ added in v1.2.4
JSONUnmarshal is a custom JSON decoder that makes numbers easier to work with.
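The source does not say how numbers are treated. One common technique, shown here purely as an assumption, is `json.Decoder.UseNumber`, which keeps numbers as `json.Number` instead of converting them to float64. The name `unmarshalWithNumbers` is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// unmarshalWithNumbers is a hypothetical sketch, not the package's code.
// Numbers decoded into interface{} values arrive as json.Number, so large
// integers keep their exact digits.
func unmarshalWithNumbers(src []byte, data interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(src))
	dec.UseNumber()
	return dec.Decode(data)
}

func main() {
	var obj map[string]interface{}
	if err := unmarshalWithNumbers([]byte(`{"id": 12345678901234567890}`), &obj); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%T %v\n", obj["id"], obj["id"]) // json.Number 12345678901234567890
}
```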
func Levenshtein ¶ added in v0.0.7
func Levenshtein(src string, target string, insertCost int, deleteCost int, substituteCost int, caseSensitive bool) int
Levenshtein does a fuzzy match on two strings.
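For readers unfamiliar with the technique, a weighted Levenshtein distance can be sketched as below. The function `editDistance` is hypothetical and only mirrors the documented parameter list; how the package folds case or handles runes is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// editDistance is a hypothetical sketch of a weighted Levenshtein distance:
// the cheapest sequence of inserts, deletes and substitutions that turns
// src into target.
func editDistance(src, target string, insertCost, deleteCost, substituteCost int, caseSensitive bool) int {
	if !caseSensitive {
		src, target = strings.ToLower(src), strings.ToLower(target)
	}
	s, t := []rune(src), []rune(target)
	// prev[j] holds the cost of turning the processed prefix of s into t[:j].
	prev := make([]int, len(t)+1)
	for j := range prev {
		prev[j] = j * insertCost
	}
	for i := 1; i <= len(s); i++ {
		curr := make([]int, len(t)+1)
		curr[0] = i * deleteCost
		for j := 1; j <= len(t); j++ {
			sub := prev[j-1]
			if s[i-1] != t[j-1] {
				sub += substituteCost
			}
			curr[j] = min3(sub, prev[j]+deleteCost, curr[j-1]+insertCost)
		}
		prev = curr
	}
	return prev[len(t)]
}

func main() {
	fmt.Println(editDistance("kitten", "sitting", 1, 1, 1, true)) // 3
	fmt.Println(editDistance("Kitten", "kitten", 1, 1, 1, false)) // 0
}
```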
func NormalizeDelimiter ¶ added in v0.0.7
NormalizeDelimiter handles the messy translation from a format string received as an option in the CLI to something useful to pass to Join.
func NormalizeDelimiterRune ¶ added in v0.0.11
NormalizeDelimiterRune takes a delimiter string and returns a single rune.
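A shell usually passes an option such as `\t` as two literal characters, a backslash and a `t`, so some translation is needed before the value can serve as a delimiter. The sketch below shows one way to do that; the function names are hypothetical and the set of escapes handled is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// normalizeDelimiter is a hypothetical sketch that turns the two-character
// sequences \t, \n and \r into the control characters they name.
func normalizeDelimiter(s string) string {
	replacer := strings.NewReplacer(`\t`, "\t", `\n`, "\n", `\r`, "\r")
	return replacer.Replace(s)
}

// normalizeDelimiterRune returns the first rune of the normalized string.
// For an empty string it returns utf8.RuneError.
func normalizeDelimiterRune(s string) rune {
	r, _ := utf8.DecodeRuneInString(normalizeDelimiter(s))
	return r
}

func main() {
	fields := []string{"a", "b", "c"}
	fmt.Printf("%q\n", strings.Join(fields, normalizeDelimiter(`\t`))) // "a\tb\tc"
	fmt.Printf("%q\n", normalizeDelimiterRune("|"))                     // '|'
}
```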
func ParseRange ¶ added in v0.0.10
ParseRange takes a string in the form of a "range expression" like 1,2 (one and two), 1-3 (one, two, three) or 1,2,8-10 (one, two, eight, nine, ten) and returns an array of ints holding the values of the range expression.
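The range expressions described above could be parsed along these lines. This is a sketch, not the package's code: `parseRange` is a hypothetical name, and rejecting descending spans and negative numbers is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRange is a hypothetical sketch that expands an expression such as
// "1,2,8-10" into the individual integers it names.
func parseRange(s string) ([]int, error) {
	var values []int
	for _, part := range strings.Split(s, ",") {
		part = strings.TrimSpace(part)
		if lo, hi, isSpan := strings.Cut(part, "-"); isSpan {
			start, err := strconv.Atoi(strings.TrimSpace(lo))
			if err != nil {
				return nil, fmt.Errorf("invalid range start in %q: %w", part, err)
			}
			end, err := strconv.Atoi(strings.TrimSpace(hi))
			if err != nil {
				return nil, fmt.Errorf("invalid range end in %q: %w", part, err)
			}
			if start > end {
				return nil, fmt.Errorf("descending range %q", part)
			}
			for n := start; n <= end; n++ {
				values = append(values, n)
			}
			continue
		}
		n, err := strconv.Atoi(part)
		if err != nil {
			return nil, fmt.Errorf("invalid number %q: %w", part, err)
		}
		values = append(values, n)
	}
	return values, nil
}

func main() {
	values, err := parseRange("1,2,8-10")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(values) // [1 2 8 9 10]
}
```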
Types ¶
type Options ¶ added in v0.0.7
type Options struct {
	AllowCharacters  string
	AllowPunctuation bool
	ToLower          bool
	ToUpper          bool
	StopWords        []string
	Delimiter        string
	Format           int
}
Options is the data structure to configure the Text2Fields parser
type SQLCfg ¶ added in v1.1.4
type SQLCfg struct {
	DSN            string `json:"dsn_url,omitempty"`
	WriteHeaderRow bool   `json:"header_row,omitempty"`
	Delimiter      string `json:"delimiter,omitempty"`
	UseCRLF        bool   `json:"use_crlf,omitempty"`
}
SQLCfg holds the information for connecting to a SQLStore and options for the CSV output.
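The struct tags suggest the configuration can be read from JSON. The sketch below redeclares the struct locally so it compiles on its own; `parseCfg` is a hypothetical helper and the DSN value is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SQLCfg is a local copy of the documented struct so this sketch is
// self-contained.
type SQLCfg struct {
	DSN            string `json:"dsn_url,omitempty"`
	WriteHeaderRow bool   `json:"header_row,omitempty"`
	Delimiter      string `json:"delimiter,omitempty"`
	UseCRLF        bool   `json:"use_crlf,omitempty"`
}

// parseCfg is a hypothetical helper that decodes a JSON configuration.
func parseCfg(src []byte) (*SQLCfg, error) {
	cfg := new(SQLCfg)
	if err := json.Unmarshal(src, cfg); err != nil {
		return nil, err
	}
	return cfg, nil
}

func main() {
	// Example values only; the file name in the DSN is invented.
	src := []byte(`{"dsn_url": "sqlite3://example.db", "header_row": true, "delimiter": ","}`)
	cfg, err := parseCfg(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *cfg)
}
```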
type SQLStore ¶ added in v1.1.4
type SQLStore struct {
	// Protocol holds the database type string, e.g. mysql, sqlite, pg
	Protocol string
	// Host name of service where to connect
	Host string
	// Port of service
	Port string
	// Database name you're going to query against
	Database string
	// User name for accessing a database service
	User string
	// Password for accessing a database service
	Password string

	// WriteHeaderRow tracks desired behavior about generating
	// a header row in the CSV encoded output. NOTE: using OpenSQLStore()
	// sets this value to true.
	WriteHeaderRow bool
	// contains filtered or unexported fields
}
SQLStore represents a wrapper around SQL database drivers using a common struct.
func OpenSQLStore ¶ added in v1.1.4
OpenSQLStore opens a mysql, postgres or SQLite database based on a data source name expressed as a URL. The URL is formed by using the "protocol" to identify the service (e.g. "mysql://", "sqlite3://", "pg://") followed by a data source name per golang sql package documentation.
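To make the URL form concrete, the sketch below separates the protocol from the data source name that follows it. `splitDSNURL` is a hypothetical helper and the example DSNs are invented; which driver each protocol maps to, and how OpenSQLStore actually parses the URL, are not shown in the source.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDSNURL is a hypothetical helper, not part of datatools. It splits
// "protocol://dsn" and accepts only the protocols named in the text above.
func splitDSNURL(dsnURL string) (string, string, error) {
	protocol, dsn, found := strings.Cut(dsnURL, "://")
	if !found {
		return "", "", fmt.Errorf("missing protocol in %q", dsnURL)
	}
	switch protocol {
	case "mysql", "sqlite3", "pg":
		return protocol, dsn, nil
	default:
		return "", "", fmt.Errorf("unsupported protocol %q", protocol)
	}
}

func main() {
	// Invented example values.
	for _, u := range []string{"sqlite3://example.db", "mysql://user:secret@/example", "example.db"} {
		protocol, dsn, err := splitDSNURL(u)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("protocol=%s dsn=%s\n", protocol, dsn)
	}
}
```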
Directories ¶
Path | Synopsis |
---|---|
cmd/codemeta2cff | codemeta2cff.go converts a codemeta.json file to CITATION.cff. |
cmd/csv2json | csv2json is a command line tool that takes CSV input from stdin and writes out a JSON expression. |
cmd/csv2mdtable | csv2mdtable is a command line tool that takes CSV input from stdin and writes out a GitHub Flavored Markdown table. |
cmd/csv2tab | csv2tab converts a CSV file to tab separated values. |
cmd/csv2xlsx | csv2xlsx is a command line utility that will convert a CSV file and insert it into a named sheet in an Excel Workbook. |
cmd/csvcleaner | csvcleaner provides some basic cleaning functions that are applied across a CSV file. |
cmd/csvcols | csvcols is a command line tool that takes each argument in order and outputs a line in CSV format. |
cmd/csvfind | csvfind is a command line tool that takes CSV files and returns the rows that match a column value. |
cmd/csvjoin | csvjoin is a command line tool that takes two CSV files and joins them by matching a designated column in each. |
cmd/csvrows | csvrows can filter selected rows or row ranges, or turn each command line parameter into a CSV row of output. |
cmd/finddir | finddir is a simple directory tree walker that looks for directories by name, basename or extension. |
cmd/findfile | findfile is a simple directory tree walker that looks for files by name, basename or extension. |
cmd/json2toml | json2toml is a command line utility that converts JSON objects to TOML. |
cmd/json2yaml | json2yaml is a command line utility that converts JSON objects to YAML. |
cmd/jsoncols | jsoncols is a command line tool for filtering JSON data from standard input or specified files. |
cmd/jsonjoin | jsonjoin is a command line tool that takes two JSON documents and combines them into one depending on the options. |
cmd/jsonmunge | jsonmunge is a command line tool that takes a JSON document and a Go text/template, rendering the result. |
cmd/jsonobjects2csv | jsonobjects2csv is a command line utility that converts a JSON list of objects to CSV. |
cmd/jsonrange | jsonrange iterates over an array or map, returning either a JSON expression or map key to stdout. |
cmd/mergepath | mergepath.go merges the path variable to avoid duplicates. |
cmd/range | range emits a list of integers separated by spaces, starting from the first command line parameter to the last command line parameter. |
cmd/reldate | Generates a date in YYYY-MM-DD format based on a relative time description. |
cmd/reltime | Generates a time in HH:MM:SS format based on a relative time description. |
cmd/string | string is a command line utility to expose some of the Golang strings functions to the command line. |
cmd/tab2csv | tab2csv converts a tab delimited file to a CSV formatted file. |
cmd/timefmt | timefmt formats a date based on the formatting options available with Golang's Time.Format. |
cmd/toml2json | toml2json is a command line utility that converts TOML to JSON. |
cmd/urlparse | urlparse is a URL parser library for use in Bash scripts. |
cmd/xlsx2csv | xlsx2csv is a command line utility that converts individual Excel Workbook Sheets to CSV. |
cmd/xlsx2json | xlsx2json is a command line utility that converts an Excel Workbook Sheet into JSON. |
cmd/yaml2json | yaml2json is a command line utility that converts YAML to JSON. |
reldate | Package reldate generates a date in YYYY-MM-DD format based on a relative time description. |
timefmt | timefmt provides additional common formats found around the web that are missing from Golang's own time package. |