README ¶
This contains the implementation of the types.Mlrval datatype, which is used for record values as well as expression/variable values in the Miller put/filter DSL.
Mlrval
The types.Mlrval structure includes string, int, float, boolean, array-of-mlrval, map-string-to-mlrval, void, absent, and error types as well as type-conversion logic for various operators.
- Miller's absent type is like JavaScript's undefined -- it's for times when there is no such key, as in a DSL expression $out = $foo when the input record is $x=3,y=4 -- there is no $foo so $foo has absent type. Nothing is written to the $out field in this case. See also the Miller documentation for more information.
- Miller's void type is like JavaScript's null -- it's for times when there is a key with no value, as in $out = $x when the input record is $x=,$y=4. This is an overlap with string type, since a void value looks like an empty string. I've gone back and forth on this (including when I was writing the C implementation) -- whether to retain void as a distinct type from empty-string, or not. I ended up keeping it as it made the Mlrval logic easier to understand.
- Miller's error type is for things like doing type-uncoerced addition of strings. Data-dependent errors are intended to result in (error)-valued output, rather than crashing Miller. See also the Miller documentation for more information.
- Miller's number handling makes auto-overflow from int to float transparent, while preserving the possibility of 64-bit bitwise arithmetic.
  - This is different from JavaScript, which has only double-precision floats and thus no support for 64-bit integers (note however that there is now BigInt).
  - This is also different from C and Go, wherein casts are necessary -- without which int arithmetic overflows.
  - Using $a * $b in Miller will auto-overflow to float. Using $a .* $b will stick with 64-bit integers (if $a and $b are already 64-bit integers).
  - More generally:
    - Bitwise operators such as |, &, and ^ map ints to ints.
    - The auto-overflowing math operators +, *, etc. map ints to ints unless they overflow, in which case float is produced.
    - The int-preserving math operators .+, .*, etc. map ints to ints even if they overflow.
  - See also the Miller documentation for the semantics of Miller arithmetic, which the Mlrval class implements.
- Since a Mlrval can be of type array-of-mlrval or map-string-to-mlrval, a Mlrval is suited for JSON decoding/encoding.
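The difference between the auto-overflowing and int-preserving operators can be sketched in plain Go. This is only an illustration of the rules listed above, not Miller's code: the names mulOverflows, autoTimes, and dotTimes are hypothetical, and the overflow test shown is one common way to detect int64 multiplication overflow.

```go
package main

import (
	"fmt"
	"math"
)

// mulOverflows reports whether a*b does not fit in an int64.
func mulOverflows(a, b int64) bool {
	if a == 0 || b == 0 {
		return false
	}
	// MinInt64 * -1 wraps back to MinInt64, which the division check below
	// would not catch, so handle it explicitly.
	if (a == -1 && b == math.MinInt64) || (b == -1 && a == math.MinInt64) {
		return true
	}
	c := a * b // wraps around on overflow
	return c/b != a
}

// autoTimes is a hypothetical model of the auto-overflowing "*" operator:
// the result is an int64 when it fits and a float64 otherwise.
func autoTimes(a, b int64) interface{} {
	if mulOverflows(a, b) {
		return float64(a) * float64(b)
	}
	return a * b
}

// dotTimes is a hypothetical model of the int-preserving ".*" operator:
// the result is always an int64, wrapping around on overflow.
func dotTimes(a, b int64) int64 {
	return a * b
}

func main() {
	fmt.Println(autoTimes(3, 4))             // stays an int
	fmt.Println(autoTimes(math.MaxInt64, 2)) // becomes a float
	fmt.Println(dotTimes(math.MaxInt64, 2))  // wraps around as an int
}
```

Returning interface{} keeps the sketch short; a real value type would carry an explicit type tag instead.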
Mlrmap
types.Mlrmap is the sequence of key-value pairs which represents a Miller record. The key-lookup mechanism is optimized for Miller read/write usage patterns -- please see mlrmap.go for more details.
It's also an ordered map structure, with string keys and Mlrval values. This is used within Mlrval itself.
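As a rough illustration of what "ordered map with string keys" means, here is a minimal sketch. It is not the structure in mlrmap.go: the names entry and orderedMap are hypothetical, values are plain strings rather than Mlrvals, and the slice-plus-index layout is just one simple way to keep insertion order.

```go
package main

import "fmt"

// entry is one key-value pair; the value is a plain string in this sketch.
type entry struct {
	key   string
	value string
}

// orderedMap keeps pairs in insertion order and indexes them by key.
type orderedMap struct {
	entries []entry
	index   map[string]int // key -> position in entries
}

func newOrderedMap() *orderedMap {
	return &orderedMap{index: make(map[string]int)}
}

// Put updates an existing key in place, or appends a new pair at the end.
func (m *orderedMap) Put(key, value string) {
	if i, ok := m.index[key]; ok {
		m.entries[i].value = value
		return
	}
	m.index[key] = len(m.entries)
	m.entries = append(m.entries, entry{key: key, value: value})
}

// Get returns the value for key and whether the key is present.
func (m *orderedMap) Get(key string) (string, bool) {
	i, ok := m.index[key]
	if !ok {
		return "", false
	}
	return m.entries[i].value, true
}

// Keys returns the keys in insertion order.
func (m *orderedMap) Keys() []string {
	keys := make([]string, 0, len(m.entries))
	for _, e := range m.entries {
		keys = append(keys, e.key)
	}
	return keys
}

func main() {
	m := newOrderedMap()
	m.Put("x", "3")
	m.Put("y", "4")
	m.Put("x", "5") // overwrites the value but keeps x in first position
	fmt.Println(m.Keys())
}
```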
Context
types.Context supports AWK-like variables such as FILENAME, NF, NR, and so on.
A note on JSON
- The code for JSON I/O is mixed between Mlrval and Mlrmap. This is unsurprising since JSON is a mutually recursive data structure -- arrays can contain maps and vice versa.
- JSON has non-collection types (string, int, float, etc.) as well as collection types (array and object). Support for objects is principally in ./mlrmap_json.go; support for non-collection types as well as arrays is in ./mlrval_json.go.
- Both multi-line and single-line formats are supported.
- Callsites for JSON output are record-writing (e.g. --ojson), the dump and print DSL routines, and the json_stringify DSL function.
  - The choice between single-line and multi-line for JSON record-writing is controlled by --jvstack and --no-jvstack, the former (multiline) being the default.
  - The dump and print DSL routines produce multi-line output without a way for the user to choose single-line output.
  - The json_stringify DSL function lets the user specify multi-line or single-line, with the former being the default.
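To make the single-line versus multi-line distinction concrete, the sketch below uses Go's standard encoding/json package rather than Miller's own encoder. The helper names singleLine and multiLine are hypothetical. Note one difference from Miller records: encoding/json sorts map keys, so this does not preserve insertion order the way an ordered record would.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// singleLine encodes v as compact JSON on one line.
func singleLine(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// multiLine encodes v as indented JSON spread over several lines.
func multiLine(v interface{}) (string, error) {
	b, err := json.MarshalIndent(v, "", "  ")
	return string(b), err
}

func main() {
	// A nested value: a map containing an array, which JSON allows to
	// nest in either direction.
	record := map[string]interface{}{
		"x":      3,
		"values": []int{1, 2},
	}

	one, err := singleLine(record)
	if err != nil {
		panic(err)
	}
	fmt.Println(one)

	many, err := multiLine(record)
	if err != nil {
		panic(err)
	}
	fmt.Println(many)
}
```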
Documentation ¶
Overview ¶
Package types contains the implementation of the Mlrval datatype which is used for record values, as well as expression/variable values in the Miller put/filter DSL.
func NewEndOfStreamMarkerList ¶
TODO: comment. Judging by its name, this returns a list holding an end-of-stream marker (see NewEndOfStreamMarker).
Types ¶
type Context ¶
type Context struct {
	FILENAME string
	FILENUM  int64
	// This is computed dynamically from the current record's field-count
	// NF int
	NR  int64
	FNR int64
	// XXX 1513
	JSONHadBrackets bool
}
func NewContext ¶
func NewContext() *Context
TODO: comment: Remember command-line values to pass along to CST evaluators. The options struct-pointer can be nil when invoked by non-DSL verbs such as join or seqgen.
func NewNilContext ¶
func NewNilContext() *Context
TODO: comment: Remember command-line values to pass along to CST evaluators. The options struct-pointer can be nil when invoked by non-DSL verbs such as join or seqgen.
func (*Context) GetStatusString ¶
func (*Context) UpdateForInputRecord ¶
func (context *Context) UpdateForInputRecord()
For the record-readers to update their initial context as each new record is read.
func (*Context) UpdateForStartOfFile ¶
For the record-readers to update their initial context as each new file is opened.
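The bookkeeping behind these two update methods can be sketched as follows, assuming AWK-style semantics: NR counts records across all files and FNR restarts for each file. The type readerContext and its method names are hypothetical stand-ins, not the signatures of types.Context.

```go
package main

import "fmt"

// readerContext is a simplified, hypothetical stand-in for types.Context.
type readerContext struct {
	FILENAME string
	FILENUM  int64
	NR       int64 // record number across all files
	FNR      int64 // record number within the current file
}

// startOfFile records the new file name and restarts the per-file counter.
func (c *readerContext) startOfFile(filename string) {
	c.FILENAME = filename
	c.FILENUM++
	c.FNR = 0
}

// inputRecord advances both record counters by one.
func (c *readerContext) inputRecord() {
	c.NR++
	c.FNR++
}

// readAll simulates reading the given number of records from each file.
func readAll(files []string, recordsPerFile []int) readerContext {
	var c readerContext
	for i, name := range files {
		c.startOfFile(name)
		for j := 0; j < recordsPerFile[i]; j++ {
			c.inputRecord()
		}
	}
	return c
}

func main() {
	c := readAll([]string{"a.csv", "b.csv"}, []int{2, 3})
	fmt.Println(c.FILENAME, c.FILENUM, c.NR, c.FNR)
}
```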
type RecordAndContext ¶
type RecordAndContext struct {
	Record       *mlrval.Mlrmap
	Context      Context
	OutputString string
	EndOfStream  bool
}
func NewEndOfStreamMarker ¶
func NewEndOfStreamMarker(context *Context) *RecordAndContext
NewEndOfStreamMarker returns a RecordAndContext that marks the end of the record stream.
func NewOutputString ¶
func NewOutputString(outputString string, context *Context) *RecordAndContext
For print/dump/etc to insert strings sequenced into the record-output stream. This avoids race conditions between different goroutines printing to stdout: we have a single designated goroutine printing to stdout. This makes output more predictable and intuitive for users; it also makes our regression tests run reliably the same each time.
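The single-writer idea described above can be sketched with one channel and one goroutine. This is an illustration only: the item type is a hypothetical, much-reduced stand-in for RecordAndContext, records are plain strings, and output goes to a strings.Builder instead of stdout so the result can be inspected.

```go
package main

import (
	"fmt"
	"strings"
)

// item is a hypothetical stand-in for RecordAndContext: it carries either a
// formatted record, a string to print verbatim, or an end-of-stream flag.
type item struct {
	record       string
	outputString string
	endOfStream  bool
}

// writeAll is the only goroutine that writes output. Because records and
// printed strings arrive on the same channel, they come out in send order.
func writeAll(items <-chan item, out *strings.Builder, done chan<- struct{}) {
	for it := range items {
		if it.endOfStream {
			break
		}
		if it.outputString != "" {
			out.WriteString(it.outputString)
		} else {
			out.WriteString(it.record + "\n")
		}
	}
	close(done)
}

// run sends a mix of records and print-style strings, then waits for the
// writer to finish and returns everything it wrote.
func run() string {
	items := make(chan item)
	done := make(chan struct{})
	var out strings.Builder

	go writeAll(items, &out, done)

	items <- item{record: "x=3"}
	items <- item{outputString: "printed by the DSL\n"}
	items <- item{record: "x=4"}
	items <- item{endOfStream: true}

	<-done
	return out.String()
}

func main() {
	fmt.Print(run())
}
```

Because every producer hands its output to the same channel, the interleaving is decided by send order rather than by goroutine scheduling, which is what makes the output repeatable.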
func NewRecordAndContext ¶
func NewRecordAndContext(record *mlrval.Mlrmap, context *Context) *RecordAndContext
func (*RecordAndContext) Copy ¶
func (rac *RecordAndContext) Copy() *RecordAndContext
Copy returns a copy of the RecordAndContext.
type TypeGatedMlrvalName ¶
func NewTypeGatedMlrvalName ¶
func NewTypeGatedMlrvalName(name string, typeName string) (*TypeGatedMlrvalName, error)
type TypeGatedMlrvalVariable ¶
type TypeGatedMlrvalVariable struct {
// contains filtered or unexported fields
}
func (*TypeGatedMlrvalVariable) Assign ¶
func (tvar *TypeGatedMlrvalVariable) Assign(value *mlrval.Mlrval) error
func (*TypeGatedMlrvalVariable) GetName ¶
func (tvar *TypeGatedMlrvalVariable) GetName() string
func (*TypeGatedMlrvalVariable) GetValue ¶
func (tvar *TypeGatedMlrvalVariable) GetValue() *mlrval.Mlrval
func (*TypeGatedMlrvalVariable) Unassign ¶
func (tvar *TypeGatedMlrvalVariable) Unassign()
func (*TypeGatedMlrvalVariable) ValueString ¶
func (tvar *TypeGatedMlrvalVariable) ValueString() string
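The listing above gives only signatures, so the following is a guess at what "type-gated" means: a variable that remembers a declared type name and rejects assignments of any other type. Everything here is hypothetical, including the gatedVar type, the type names "int", "float", "str", "boolean", and "any", and the use of native Go values in place of Mlrvals.

```go
package main

import "fmt"

// gatedVar is a hypothetical stand-in for a type-gated variable.
type gatedVar struct {
	name     string
	typeName string
	value    interface{}
}

// typeNameOf maps a Go value to one of the type names used in this sketch.
func typeNameOf(v interface{}) string {
	switch v.(type) {
	case int64:
		return "int"
	case float64:
		return "float"
	case string:
		return "str"
	case bool:
		return "boolean"
	default:
		return "unknown"
	}
}

func newGatedVar(name, typeName string) *gatedVar {
	return &gatedVar{name: name, typeName: typeName}
}

// assign stores value only if it matches the declared type; otherwise it
// leaves the variable unchanged and returns an error.
func (g *gatedVar) assign(value interface{}) error {
	if g.typeName != "any" && typeNameOf(value) != g.typeName {
		return fmt.Errorf(
			"cannot assign %s value to variable %s of type %s",
			typeNameOf(value), g.name, g.typeName,
		)
	}
	g.value = value
	return nil
}

func main() {
	n := newGatedVar("n", "int")
	fmt.Println(n.assign(int64(3)))
	fmt.Println(n.assign("abc"))
}
```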