Purpose of the REPL
The Miller read-evaluate-print loop is an interactive counterpart to record-processing using the put/filter domain-specific language.
Using put and filter, you can do the following:
- Specify input format (e.g. --icsv), output format (e.g. --ojson), etc. using command-line flags.
- Specify filenames on the command line.
- Define begin {...} blocks which are executed before the first record is read.
- Define end {...} blocks which are executed after the last record is read.
- Define user-defined functions/subroutines using func and subr.
- Specify statements to be executed on each record -- which are anything outside of begin/end/func/subr.
- Example: mlr --icsv --ojson put 'begin {print "HELLO"} $z = $x + $y; end {print "GOODBYE"}'
Using the REPL, by contrast, you get interactive control over those same steps:
- Specify input format (e.g. --icsv), output format (e.g. --ojson), etc. using command-line flags.
- Specify filenames either on the command line or via :open at the Miller REPL.
- Read records one at a time using :read.
- Skip ahead using statements :skip 10 or :skip until NR == 100 or :skip until $status_code != 200.
- Similarly, but processing records rather than skipping past them, using :process rather than :skip.
- Define begin {...} blocks; invoke them at will using :begin.
- Define end {...} blocks; invoke them at will using :end.
- Define user-defined functions/subroutines using func/subr; call them from other statements.
- Interactively specify statements to be executed on the current record.
- Load any of the above from Miller-script files using :load.
- Furthermore, any DSL statements other than begin/end/func/subr that are loaded using :load, or entered in multiline input mode, are remembered and can be invoked on a given record using :main. To use multiline input mode, type < on a line by itself, enter the code, then type > on a line by itself. In multiline mode and load-from-file, semicolons are required between statements; otherwise they are not needed.
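The multiline input mode described above can be pictured as a small line-grouping step. The following is a minimal sketch in Go, not Miller's actual code: the function name collectStatements and its behavior on blank lines are assumptions made for illustration. It treats each ordinary line as its own chunk and joins everything between a lone < and a lone > into a single chunk.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// collectStatements is a hypothetical helper (not part of Miller).
// It groups input lines into chunks of DSL text:
//   - outside multiline mode, each non-blank line is one chunk;
//   - lines between a lone "<" and a lone ">" are joined into one chunk.
func collectStatements(input string) []string {
	var chunks []string
	var pending []string
	inMultiline := false

	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		line := scanner.Text()
		trimmed := strings.TrimSpace(line)
		switch {
		case !inMultiline && trimmed == "<":
			// Enter multiline mode and start a fresh buffer.
			inMultiline = true
			pending = nil
		case inMultiline && trimmed == ">":
			// Leave multiline mode and emit the buffered lines as one chunk.
			inMultiline = false
			chunks = append(chunks, strings.Join(pending, "\n"))
		case inMultiline:
			pending = append(pending, line)
		case trimmed != "":
			chunks = append(chunks, line)
		}
	}
	return chunks
}

func main() {
	session := "x=3\n<\nfunc f(a,b) {\n  return a**b\n}\n>\nf(7,5)\n"
	for i, chunk := range collectStatements(session) {
		fmt.Printf("chunk %d:\n%s\n", i+1, chunk)
	}
}
```

Here the function definition arrives as one chunk, which is why statements inside it need explicit semicolons between them, while the single-line inputs x=3 and f(7,5) each stand alone.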
At this REPL prompt you can enter any Miller DSL expression. REPL-only statements (non-DSL statements) start with :, such as :help or :quit. Type :help to see more about your options.
No command-line-history-editing feature is built in, but rlwrap mlr repl is a delight. You may need brew install rlwrap, sudo apt-get install rlwrap, etc., depending on your platform.
The input "record" by default is the empty map, but you can do things like $x=3, or unset $y, or $* = {"x": 3, "y": 4} to populate it. Alternatively, use :open foo.dat followed by :read to populate it from a data file.
Non-assignment expressions, such as 7 or true, operate as filter conditions in the put DSL: they can be used to specify whether a record will or won't be included in the output-record stream. But here in the REPL, they are simply printed to the terminal, e.g. if you type 1+2, you will see 3.
Examples:
$ mlr repl
Miller v6.0.0-dev
Type ':help' for on-line help; ':quit' to quit.
[mlr]
[mlr] 1+2
3
[mlr] x=3 # These are local variables
[mlr] y=4
[mlr] x+y
7
[mlr] <
func f(a,b) {
  return a**b
}
>
[mlr] f(7,5)
16807
[mlr] :open foo.dat
[mlr] :read
[mlr] :context
FILENAME="foo.dat",FILENUM=1,NR=1,FNR=1
[mlr] $*
{
  "a": "eks",
  "b": "wye",
  "i": 4,
  "x": 0.38139939387114097,
  "y": 0.13418874328430463
}
[mlr] f($x,$i)
0.021160211005187134
[mlr] $z = f($x, $i)
[mlr] $*
{
  "a": "eks",
  "b": "wye",
  "i": 4,
  "x": 0.38139939387114097,
  "y": 0.13418874328430463,
  "z": 0.021160211005187134
}
Implementation of the REPL
The REPL is a small layer around the CST and the put verb. Most of the keystroking here is for online help and command-line parsing.
One subtlety is that non-assignment expressions like NR < 10 are filter statements within put -- they can be used to control whether or not a given record is included in the output stream. Here, in the REPL, these expressions are simply printed to the terminal. And for :skip until ... or :process until ..., they're used as the exit condition to break out of reading input records.
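As an illustration of that last point, here is a minimal sketch in Go of an until-condition used as a loop exit. It is not taken from the Miller source: the record type, the skipUntil function, and the choice to stop on the first record for which the condition is true are all assumptions made for the example.

```go
package main

import "fmt"

// record is a stand-in for a Miller record; the real type is different.
type record map[string]int

// skipUntil is a hypothetical helper. It walks the records in order and
// stops at the first one for which cond returns true. It returns the
// 1-based record number it stopped on and whether the condition was met.
// If the input runs out first, it returns the record count and false.
func skipUntil(records []record, cond func(nr int, rec record) bool) (int, bool) {
	for i, rec := range records {
		nr := i + 1
		if cond(nr, rec) {
			return nr, true
		}
	}
	return len(records), false
}

func main() {
	records := []record{
		{"status_code": 200},
		{"status_code": 200},
		{"status_code": 404},
		{"status_code": 200},
	}
	// Roughly analogous to ":skip until $status_code != 200".
	nr, found := skipUntil(records, func(nr int, rec record) bool {
		return rec["status_code"] != 200
	})
	fmt.Println(nr, found) // prints "3 true"
}
```

The same boolean expression that would filter records inside put serves here only to decide when to stop reading.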
File structure
- types.go -- data types including the Repl class.
- entry.go -- shell command-line entry point to the Miller repl command line. For example, it handles mlr repl --json, which is typed at the shell prompt, and starts a command-line session at the Miller REPL prompt.
- session.go -- constructs a Repl object and ingests command lines, dispatching them either to the DSL (e.g. $z = $x + $y) or to the non-DSL verb handler (e.g. :open foo.dat or :help).
- prompt.go -- handling for default and customized banners/prompts for the Miller REPL.
- dsl.go -- handler for taking DSL statements typed in interactively by the user, parsing them to an AST, building a CST from the AST, and executing the CST.
- ast.go -- interface between the REPL and the DSL-to-AST parser.
- verbs.go -- handlers for non-DSL statements like :open foo.dat or :help.