Sometimes we all need to find duplicate files on our system; this is a very tedious task, especially if we have to do it “by hand”.
If you are a GNU/Linux user (and if you are reading this, you are), you know that, following the UNIX tradition, there is a tool for everything. And so, GitHub has the solution: fdupes. This is a tool written in C and released under the MIT license for identifying duplicate files residing within specified directories.
Of course, you could build it from source, but it’s not necessary.
On Debian-based systems you can install it with APT.
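A minimal sketch of the APT install, assuming you have sudo rights and that the package is named fdupes in your distribution's repositories:

```shell
# Debian, Ubuntu and derivatives: refresh the package index, then install.
# Assumption: the package name is "fdupes".
sudo apt update
sudo apt install fdupes
```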
On Fedora, CentOS and RHEL (after enabling the EPEL repository on the latter two), you can install it with the package manager.
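A minimal sketch for the RPM-based family, assuming sudo rights, a system that ships dnf, and a package named fdupes:

```shell
# Fedora, and CentOS/RHEL with EPEL enabled.
# Assumption: the package name is "fdupes"; on older releases that
# only ship yum, "sudo yum install fdupes" is the equivalent.
sudo dnf install fdupes
```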
How to use fdupes
Using fdupes is really easy. To find duplicates, just run it with the directory to scan as its argument.
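As a minimal sketch of that basic invocation, the snippet below builds a throwaway directory and scans it. The demo_dir variable and the file names are made up for illustration, and the fdupes call is guarded so nothing fails on a machine where it is not installed yet:

```shell
# Throwaway directory with two identical files and one different file.
demo_dir=$(mktemp -d)
printf 'same content\n'  > "$demo_dir/a.txt"
printf 'same content\n'  > "$demo_dir/b.txt"
printf 'other content\n' > "$demo_dir/c.txt"

# Scan that directory only (sub-directories are not visited).
if command -v fdupes >/dev/null 2>&1; then
    fdupes "$demo_dir"
else
    echo "fdupes is not installed" >&2
fi
```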
This command only looks in the directory given as an argument, and prints the list of duplicate files (if there are any).
If you also want to look in sub-directories, you must add the “-r” option, which stands for “recurse”.
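To see the difference “-r” makes, here is a small sketch with a duplicate hidden one level down. The tree_dir variable and the file names are hypothetical:

```shell
# One file at the top level and an identical copy inside a sub-directory.
tree_dir=$(mktemp -d)
mkdir "$tree_dir/sub"
printf 'same content\n' > "$tree_dir/report.txt"
printf 'same content\n' > "$tree_dir/sub/report-copy.txt"

if command -v fdupes >/dev/null 2>&1; then
    # Without -r only the top level is scanned, so the copy is not seen.
    fdupes "$tree_dir"
    # With -r the sub-directory is scanned too.
    fdupes -r "$tree_dir"
fi
```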
And what if you want to see the size of the duplicate files? Of course you can, with the “-S” option.
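A short sketch of the size option, again on a scratch directory (size_dir and the file names are invented for the example):

```shell
# Two identical 10-byte files.
size_dir=$(mktemp -d)
printf '0123456789' > "$size_dir/first.bin"
printf '0123456789' > "$size_dir/second.bin"

# -S shows the size of the duplicate files along with each set.
if command -v fdupes >/dev/null 2>&1; then
    fdupes -S "$size_dir"
fi
```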
You can also specify more than one directory: just list them one after another as arguments.
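For instance, with two hypothetical directories (photos_dir and backup_dir are placeholders created just for this sketch), duplicates are searched across both:

```shell
# Two separate directories, each holding a copy of the same content.
photos_dir=$(mktemp -d)
backup_dir=$(mktemp -d)
printf 'holiday picture\n' > "$photos_dir/beach.jpg"
printf 'holiday picture\n' > "$backup_dir/beach-backup.jpg"

# Every directory given on the command line is scanned.
if command -v fdupes >/dev/null 2>&1; then
    fdupes "$photos_dir" "$backup_dir"
fi
```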
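Then, if you want to delete all the duplicates, use the “-d” option: fdupes will prompt you for the files to preserve in each set and delete all the others. Adding “-N” skips the prompt, preserving the first file in each set and deleting the rest.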
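Since deletion cannot be undone, it may be worth trying it on disposable data first. The sketch below does that in a scratch directory (scratch_dir and the file names are made up), using the non-interactive form:

```shell
# Two identical files plus one unique file, all disposable.
scratch_dir=$(mktemp -d)
printf 'same content\n'   > "$scratch_dir/copy1.txt"
printf 'same content\n'   > "$scratch_dir/copy2.txt"
printf 'unique content\n' > "$scratch_dir/unique.txt"

# -d deletes duplicates; -N keeps the first file of each set without asking.
if command -v fdupes >/dev/null 2>&1; then
    fdupes -d -N "$scratch_dir"
fi

# Show what is left: the unique file and one of the two copies.
ls "$scratch_dir"
```

Dropping “-N” from the command makes fdupes ask which file to keep in each set, which is the safer choice on real data.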
For a complete list of options, run “fdupes --help”, which will print out:
 -r --recurse      for every directory given follow subdirectories
 -R --recurse:     for each directory given after this option follow
                   subdirectories encountered within (note the ':' at
                   the end of the option, manpage for more details)
 -s --symlinks     follow symlinks
 -H --hardlinks    normally, when two or more files point to the same
                   disk area they are treated as non-duplicates; this
                   option will change this behavior
 -n --noempty      exclude zero-length files from consideration
 -A --nohidden     exclude hidden files from consideration
 -f --omitfirst    omit the first file in each set of matches
 -1 --sameline     list each set of matches on a single line
 -S --size         show size of duplicate files
 -m --summarize    summarize dupe information
 -q --quiet        hide progress indicator
 -d --delete       prompt user for files to preserve and delete all
                   others; important: under particular circumstances,
                   data may be lost when using this option together
                   with -s or --symlinks, or when specifying a
                   particular directory more than once; refer to the
                   fdupes documentation for additional information
 -N --noprompt     together with --delete, preserve the first file in
                   each set of duplicates and delete the rest without
                   prompting the user
 -v --version      display fdupes version
 -h --help         display this help message
So, now you can clean your filesystem of duplicates!