duplicate-files-finder

command module
v0.0.0-...-7d8f239 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 7, 2022 License: MIT Imports: 9 Imported by: 0

README

duplicate-files-finder

Finds duplicate files in a given directory using file size and sha1 checksum.

    foo@bar:~$ go run github.com/nilesh-akhade/duplicate-files-finder@latest -r --dir "$HOME/Downloads"
    Finding duplicate files...
    --------------------------------------------------
    Directory                    : /home/foo/Downloads
    Recursive                    : true
    Total files                  : 140087
    Unique files                 : 67500
    Files which are duplicated   : 72587
    Space taken by the duplicates: 192.83MB
    --------------------------------------------------

Design

This is an alternate implementation of the ThoughtWorks Clean Code Workshop Assignment

I tried to use SOLID, clean code principles. Feel free to open an issue, if you see any of the principle violated.

Next?

To test extensiblity and robustness of this design, a dummy customer will keep on changing the requirements as follows. A good design will have less code change, least refactoring to existing code.

  • We need names of the duplicate files
  • Program fails when it detects, non readable or system files
  • Our directory has 1 million files, this program gets out of memory
  • Program takes long time we want pause and resume functionality
  • Program takes long time can we use faster checksum algorithm instead of sha1
  • The program is still slow; can you use better data structure?
  • The program shows nothing, provide functionality to show progress.
  • Support google drive in addition to the local filesystem
  • We need similar files not exactly duplicates. Like file containing "Hello World" is 50% similar to file containing "World"

Documentation

The Go Gopher

There is no documentation for this package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL