godi

v0.2.0
Published: Jul 20, 2014 License: LGPL-3.0 Imports: 7 Imported by: 0

README

godi stands for "go data integrity" and is a command-line utility that generates signature files from a directory tree. These signatures make it possible to re-check the tree later and verify that the data is intact. This is especially useful when data is read from unreliable media and copied to another storage device.

Because verifying copy operations is so common, godi can copy files at the same moment it hashes them, optionally verifying the destination after the copy completes.

Usage

# Generate a signature for all files in directory tree/
godi seal tree/

# Results in a godi-seal.xml file
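To illustrate what sealing a tree involves, here is a minimal sketch that walks a directory and computes one digest per regular file. It is not godi's implementation: the names `sealTree`, `hashReader`, `isHidden` and `demo` are hypothetical, SHA-1 is an assumption (the README does not say which hash godi uses), skipping hidden directories as well as hidden files is an assumption, and the result is kept in a map rather than written to godi-seal.xml.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// isHidden reports whether a file name starts with a dot.
func isHidden(name string) bool { return strings.HasPrefix(name, ".") }

// hashReader returns the hex-encoded SHA-1 digest of everything read from r.
// SHA-1 is an assumption made for this sketch.
func hashReader(r io.Reader) (string, error) {
	h := sha1.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// sealTree walks root and maps each regular file's relative path to its
// digest. Hidden entries and anything that is not a regular file
// (including symbolic links) are skipped.
func sealTree(root string) (map[string]string, error) {
	sums := make(map[string]string)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if path != root && isHidden(d.Name()) {
			if d.IsDir() {
				return filepath.SkipDir // assumption: hidden directories are skipped too
			}
			return nil
		}
		if !d.Type().IsRegular() {
			return nil // directories are descended into; symlinks are ignored
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		sum, err := hashReader(f)
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		sums[rel] = sum
		return nil
	})
	return sums, err
}

// demo seals a throwaway tree containing one visible and one hidden file.
func demo() (map[string]string, error) {
	dir, err := os.MkdirTemp("", "seal-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	files := map[string]string{"a.txt": "abc", ".hidden": "ignored"}
	for name, body := range files {
		if err := os.WriteFile(filepath.Join(dir, name), []byte(body), 0o644); err != nil {
			return nil, err
		}
	}
	return sealTree(dir)
}

func main() {
	sums, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "seal failed:", err)
		os.Exit(1)
	}
	for path, sum := range sums {
		fmt.Printf("%s  %s\n", sum, path)
	}
}
```

Re-running the same walk later and comparing the digests against the stored ones is the essence of a consistency check.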

TODO:

  • nprocs - specify how many parallel gather routines there are
  • abort-on-error - if false, continue as long as possible; otherwise abort and interrupt all currently running procedures
  • log-mode - either off or verbose; in the future, possibly also a binary mode that provides much more detailed information
  • Abort if the destination file exists - see atomic mode
  • atomic mode (always on) - on cancel, remove all created files and directories
  • print read and write performance to stderr every x seconds (allows tuning the reader and writer counts, thanks to atomic mode)
  • safe mode - verify after copy
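The planned nprocs and abort-on-error options could be combined roughly as follows. This is only a sketch of a bounded worker pool under assumed semantics, not godi's design: `gather`, `demoWork` and the parameter names are hypothetical, and with abort enabled the sketch merely stops handing out new items rather than interrupting work already in progress.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// gather runs work over items using nprocs goroutines and returns how many
// items succeeded plus all collected errors. If abortOnError is true, no
// further items are handed out once the first failure has been seen.
func gather(items []string, nprocs int, abortOnError bool, work func(string) error) (done int, errs []error) {
	if nprocs < 1 {
		nprocs = 1
	}
	jobs := make(chan string)
	stop := make(chan struct{})
	var stopOnce sync.Once
	var mu sync.Mutex // guards done and errs
	var wg sync.WaitGroup

	for i := 0; i < nprocs; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range jobs {
				err := work(item)
				mu.Lock()
				if err != nil {
					errs = append(errs, err)
				} else {
					done++
				}
				mu.Unlock()
				if err != nil && abortOnError {
					stopOnce.Do(func() { close(stop) })
				}
			}
		}()
	}

feed:
	for _, item := range items {
		select {
		case jobs <- item:
		case <-stop:
			break feed
		}
	}
	close(jobs)
	wg.Wait()
	return done, errs
}

// demoWork is a stand-in for hashing one file; it fails for the item "bad".
func demoWork(item string) error {
	if item == "bad" {
		return errors.New("simulated failure")
	}
	return nil
}

func main() {
	done, errs := gather([]string{"a", "bad", "c", "d"}, 2, false, demoWork)
	fmt.Printf("%d succeeded, %d failed\n", done, len(errs))
}
```

With abort disabled the counts are deterministic; with abort enabled, the number of items processed after the first failure depends on scheduling.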

Benefits over MHL

  • Performance
    • godi is up to several times faster than mhl (a little more than twice as fast in the measurement under Performance)
    • Those inclined may maximize bandwidth by tuning parallelism
  • Copy or archive on the fly
    • While hashing, godi can also transfer the data, reading it only once in the process. With MHL, you need to copy first and hash afterwards, which reads the data twice. godi's operation assumes the storage works correctly; however, there is a safe mode which verifies the copy nonetheless.
    • It will never overwrite existing files.
  • Atomic Operation
    • It does not produce intermediate results: it either finishes successfully or does nothing at all.
    • Particularly useful when copying or archiving, as it leaves no written files behind, so you can safely abort and retry at will. This is helpful during performance tuning.
  • Usability
    • godi just works. mhl will fail (for some reason) if it finds a hidden file. godi simply ignores hidden files and symbolic links and processes everything else in its path.
    • godi comes with a state-of-the-art command-line interface that lets you learn the command by using it. No manual required.
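The copy-while-hashing, never-overwrite and clean-up-on-failure properties listed above can be sketched in a few lines of Go. This is an illustration under assumptions, not godi's code: `copyAndHash`, `hashFile` and `demo` are hypothetical names, SHA-1 is assumed, and only a single file is handled. The source is read once and each chunk is fed to both the destination and the hash via `io.MultiWriter`; `os.O_EXCL` makes the open fail if the destination already exists.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// hashFile returns the hex-encoded SHA-1 digest of the file at path.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha1.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// copyAndHash copies src to dst while hashing, reading src only once.
// It refuses to overwrite dst, removes dst again if anything fails after
// creation, and re-reads dst to compare digests when verify is true.
func copyAndHash(src, dst string, verify bool) (digest string, err error) {
	in, err := os.Open(src)
	if err != nil {
		return "", err
	}
	defer in.Close()

	out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
	if err != nil {
		return "", err // dst existed or could not be created; nothing to clean up
	}
	defer func() {
		if err != nil {
			os.Remove(dst) // do not leave a partial or unverified file behind
		}
	}()

	h := sha1.New()
	_, err = io.Copy(io.MultiWriter(out, h), in)
	if cerr := out.Close(); err == nil {
		err = cerr
	}
	if err != nil {
		return "", err
	}
	digest = hex.EncodeToString(h.Sum(nil))

	if verify {
		var got string
		if got, err = hashFile(dst); err != nil {
			return "", err
		}
		if got != digest {
			err = fmt.Errorf("verification failed for %s", dst)
			return "", err
		}
	}
	return digest, nil
}

// demo copies a small file, then checks that a second copy is refused.
func demo() error {
	dir, err := os.MkdirTemp("", "copy-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	src, dst := filepath.Join(dir, "src.bin"), filepath.Join(dir, "dst.bin")
	if err := os.WriteFile(src, []byte("abc"), 0o644); err != nil {
		return err
	}
	digest, err := copyAndHash(src, dst, true)
	if err != nil {
		return err
	}
	if digest != "a9993e364706816aba3e25717850c26c9cd0d89d" {
		return fmt.Errorf("unexpected digest %s", digest)
	}
	if _, err := copyAndHash(src, dst, false); err == nil {
		return fmt.Errorf("expected refusal to overwrite %s", dst)
	}
	_, err = os.Stat(dst) // the first copy must still be in place
	return err
}

func main() {
	if err := demo(); err != nil {
		fmt.Fprintln(os.Stderr, "demo failed:", err)
		os.Exit(1)
	}
	fmt.Println("copy, hash and overwrite check succeeded")
}
```

Note that the verification step reads the destination a second time, which is the cost of not trusting the storage.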

Performance

Preliminary results indicate a throughput of up to 900 MB/s on 2 cores, which is a little more than twice as fast as the single-threaded mhl.

I am still wondering why it doesn't benefit from more cores.

$ time  ./godi seal ~/Movies
Sealed 479 files with total size of 407.74786MB in 0.47895445400000003s (851.3290989659139 MB/s, 0 errors)

real    0m0.486s
user    0m0.879s
sys     0m0.076s
$ time mhl seal -v -t sha1 -o ~/Movies  ~/Movies
----------------------
Finished generating checksums for: 
   480 file(s) 
   with total filesize of 407 MB (427719586 bytes)
----------------------
Summary:
   480 of 480 files SUCCEEDED
-------------------
End of input.
Finish date in UTC: 2014-07-17 22:02:42
MHL file path(s):
   /Users/byron/Movies/Movies_2014-07-17_220241.mhl
===================

real    0m1.186s
user    0m1.100s
sys     0m0.085s
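The figures in the two runs above can be checked with simple arithmetic. The short program below recomputes godi's throughput from the size and elapsed time it reported, the wall-clock ratio between the two tools, and the ratio of godi's CPU time to its wall time; the helper names `rate` and `ratio` are made up for this sketch. It works out to roughly 851 MB/s, a wall-clock ratio of about 2.4, and about 1.8 cores kept busy on average.

```go
package main

import "fmt"

// rate returns throughput in MB/s for sizeMB megabytes processed in seconds.
func rate(sizeMB, seconds float64) float64 { return sizeMB / seconds }

// ratio returns how many times larger a is than b.
func ratio(a, b float64) float64 { return a / b }

func main() {
	const (
		sizeMB      = 407.74786   // total size reported by godi
		godiElapsed = 0.478954454 // seconds, as reported by godi itself
		godiReal    = 0.486       // seconds, "real" from time
		godiUser    = 0.879       // seconds, "user" from time
		mhlReal     = 1.186       // seconds, "real" from time
	)
	fmt.Printf("godi throughput: %.1f MB/s\n", rate(sizeMB, godiElapsed))
	fmt.Printf("wall-clock ratio mhl/godi: %.2f\n", ratio(mhlReal, godiReal))
	fmt.Printf("godi user time / real time: %.2f\n", ratio(godiUser, godiReal))
}
```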

Development Status

Under construction

LICENSE

This open source software is licensed under the GNU Lesser General Public License (LGPL-3.0).


Directories

Path Synopsis
Various utilities for testing purposes
Implements verification of seal files previously written with the seal command
