NAME
dtgrep - print lines matching a date range
SYNOPSIS
dtgrep --from RFC3339 --to RFC3339 --format TIME_LAYOUT syslog
DESCRIPTION
Do you even remember how often in your life you needed to find lines in
a log within a date range? And how often you built brittle regexps in grep
to match entries spanning an hour boundary?
With dtgrep you don't have to. It features
- efficient binary search on normal files
- reading bzip and gzip files without external dependencies
- automatic sorting of files
- merging of lines from different files into one output stream
- doing as little work as necessary
- a flexible syntax to declare date ranges
EXAMPLES
But just let me show you a few examples.
The only parameter dtgrep really needs is --format, which tells it how to
recognize a timestamp. In this case dtgrep matches all lines from epoch to
the time dtgrep started.
dtgrep --format "Jan _2 15:04:05" syslog
There are also some already predefined formats you can use:
dtgrep --format apache access.log
You can specify which timerange to print:
dtgrep --from 2006-01-02T12:00:00 --to 2006-01-02T12:15:00 syslog
If you leave out --from, it defaults to epoch; if you leave out --to, it
defaults to the start of the program.
dtgrep --to 2006-01-02T12:15:00 --format rsyslog syslog
dtgrep can also read lines from stdin, but filtering those will be
slower as you can't just seek in a pipe. It's often more efficient to
just redirect the lines from the pipe to a file first. But nothing is
stopping you from just calling dtgrep directly.
zcat syslog.gz | dtgrep --to 2006-01-02T12:15:00
dtgrep --to 2006-01-02T12:15:00 syslog.gz
OPTIONS
-
--from DATESPEC
Print all lines from DATESPEC onward, inclusive. Defaults to January 1,
year 1, 00:00:00 UTC. See DATESPECS for valid arguments.
-
--to DATESPEC
Print all lines until DATESPEC, exclusive. Defaults to the current
time. See DATESPECS for valid arguments.
-
--format FORMAT
FORMAT describes how a date looks. The first date found on a line is used.
FORMAT can be either a named format or any layout supported by the Go time package.
The following named formats are supported:
- rsyslog "Jan _2 15:04:05"
- apache "02/Jan/2006:15:04:05 -0700"
- iso3339 "2006-01-02T15:04:05Z07:00"
This parameter defaults to rsyslog.
-
--multiline
Print lines without timestamp between matching lines.
-
--skip-dateless
Ignore lines without timestamp.
-
--location LOCATION
If a date has no explicit timezone, interpret it as in the given
location. LOCATION must be a valid location from the IANA Time Zone
database, such as "America/New_York".
If the name is "" or "UTC", interpret dates as UTC.
This parameter defaults to the system's local time zone.
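The behaviour described for LOCATION matches that of time.LoadLocation and time.ParseInLocation in Go's standard library. The sketch below assumes dtgrep uses them in roughly this way; the function parseIn is hypothetical. The blank import of time/tzdata only embeds the time zone database so the example also runs on systems without one installed.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embedded zone database, so the sketch runs anywhere
)

// parseIn parses value with layout and interprets a timestamp without an
// explicit zone as being in the named location. Hypothetical helper.
func parseIn(layout, value, location string) (time.Time, error) {
	// LoadLocation returns UTC for the names "" and "UTC".
	loc, err := time.LoadLocation(location)
	if err != nil {
		return time.Time{}, err
	}
	return time.ParseInLocation(layout, value, loc)
}

func main() {
	const layout = "2006-01-02 15:04:05"
	for _, name := range []string{"", "America/New_York"} {
		t, err := parseIn(layout, "2006-01-02 12:00:00", name)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q -> %s\n", name, t.UTC().Format(time.RFC3339))
	}
}
```

The same wall-clock time maps to different instants depending on the location, which is why the option matters when logs were written on a machine in another time zone.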
-
--help
Show a short help message.
DATESPECS
A datespec consists of a datetime and any number of modifiers. A
datetime can be an incomplete date; in this case the missing values
will be filled in from the current date. Without a timezone designator,
the local timezone will be used. The following formats are supported:
- 04
- 15:04
- 15:04:05
- 2006-01-02 15:04
- 2006-01-02 15:04:05
- 2006-01-02 15:04:05Z07:00
- 2006-01-02T15:04:05Z07:00
- now
A modifier can be either a truncate or an add statement. Both expect a duration as argument.
- Truncate rounds the date down to a multiple of its duration.
- Add adds the duration to the date.
The duration can be any value parsable by Go's time.ParseDuration function.
Some examples:
- "15:06 truncate 5m add -5m" results in 15:00 today
- "now truncate 24h add -24h" is the beginning of yesterday
- "00:00 add -24h" is also the beginning of yesterday
- "now" is the current time
ENVIRONMENT
LIMITATION
dtgrep expects the files to be sorted. If the timestamps are not
ascending, dtgrep might exit before the last line in its date
range is printed. This also means that timestamps that jump backwards
at a daylight saving time change are not handled correctly.
SEE ALSO
This is by no means a new idea. On GitHub alone you can find at least 10
programs that solve at least the binary search part of the problem. Search
for tgrep, timegrep or dategrep.
I wrote dategrep without knowing
about these. Which is weird, as I distinctly remember that I searched
extensively for such a program, but I must have chosen very poor terms.
Since then I never lost interest in the problem. This is another try to
find the right mix of features, easy deployment and fast results.
COPYRIGHT AND LICENSE
Copyright 2016 Mario Domgoergen mario@domgoergen.com
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.