Remove duplicate files with fdupes

Introduction

Sooner or later we all need to find duplicate files on our system; this is a very tedious task, especially if we have to do it “by hand”.
If you are a GNU/Linux user (and if you are reading this, you probably are), you know that, following the UNIX tradition, there is a tool for everything. In this case the tool is fdupes, which is hosted on GitHub. It is written in C, released under the MIT license, and identifies duplicate files residing within specified directories.

Getting fdupes

Of course, you could build it from source, but it’s not necessary.
On Debian-based systems you can install it with APT:

$ sudo apt-get install fdupes

On Fedora, and on CentOS and RHEL after enabling the EPEL repository, use yum or dnf, depending on your release:

# yum install fdupes
# dnf install fdupes

How to use fdupes

Using fdupes is really easy. To find duplicates, just run:

$ fdupes /path/to/some/directory

This command looks only in the directory given as argument and prints the list of duplicate files, if there are any.
If you also want to look in sub-directories, add the “-r” (“--recurse”) option.
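To see what “-r” changes, here is a small sketch you can run safely. It builds a throwaway directory with one duplicate hidden in a sub-directory, then runs fdupes on it only if the command is installed. The demo_dir variable and all file names are made up for this example.

```shell
# Throwaway tree: a.txt and sub/b.txt have identical content, c.txt differs.
# All names here are hypothetical, chosen only for the demo.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/sub"
printf 'same content\n' > "$demo_dir/a.txt"
printf 'same content\n' > "$demo_dir/sub/b.txt"
printf 'something else\n' > "$demo_dir/c.txt"

if command -v fdupes >/dev/null 2>&1; then
    # Without -r only the top level is scanned, so no duplicates are found.
    fdupes "$demo_dir"
    # With -r the sub-directory is scanned too, and a.txt and sub/b.txt
    # are reported as one set.
    fdupes -r "$demo_dir"
else
    echo "fdupes is not installed; nothing was scanned" >&2
fi
```

The first call should print nothing, while the second should list the two identical files together.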
And what if you want to see the size of the duplicate files? Add the “-S” option:

$ fdupes -S /path/to/some/directory

You can specify more than one directory:

$ fdupes /path/to/first/directory /path/to/second/directory

and so on.
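When several directories are given, files are compared across all of them, so a file in the first directory can be matched with a copy in the second. A minimal sketch, using two hypothetical temporary directories and made-up file names, and adding “-S” to show the size of each set:

```shell
# Two hypothetical directories that share one identical file.
first_dir="$(mktemp -d)"
second_dir="$(mktemp -d)"
printf 'holiday photo bytes\n' > "$first_dir/photo.jpg"
printf 'holiday photo bytes\n' > "$second_dir/photo-copy.jpg"
printf 'notes\n' > "$second_dir/notes.txt"

if command -v fdupes >/dev/null 2>&1; then
    # -S prints the size of the files in each duplicate set.
    fdupes -S "$first_dir" "$second_dir"
fi
```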

Then, if you want to delete the duplicates, run:

$ fdupes -d /path/to/directory

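For each set of duplicates, fdupes prompts you for the files to preserve and deletes all the others. Add the “-N” option to keep the first file of each set and delete the rest without prompting.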
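Because deleting cannot be undone, it can help to rehearse on a throwaway directory first. The sketch below lists the duplicates, then deletes them with “-d” plus “-N”, which according to the fdupes help text keeps the first file in each set and removes the rest without prompting. The scratch_dir variable and the file names are made up for this example.

```shell
# Hypothetical scratch directory with two identical files and one unique file.
scratch_dir="$(mktemp -d)"
printf 'same content\n' > "$scratch_dir/report.txt"
printf 'same content\n' > "$scratch_dir/report-copy.txt"
printf 'unique content\n' > "$scratch_dir/todo.txt"

if command -v fdupes >/dev/null 2>&1; then
    # Step 1: only list the duplicate sets, nothing is deleted yet.
    fdupes "$scratch_dir"
    # Step 2: delete without prompting, keeping the first file of each set.
    fdupes -d -N "$scratch_dir"
fi

# Show what is left: todo.txt plus one of the two identical files
# (both of them if fdupes is not installed).
ls "$scratch_dir"
```

Running the listing step on its own before the delete step is a cheap way to check what is about to be removed.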

Conclusion

For a complete list of options:

$ fdupes -h

which will print out

Usage: fdupes [options] DIRECTORY...

 -r --recurse       for every directory given follow subdirectories
                    encountered within
 -R --recurse:      for each directory given after this option follow
                    subdirectories encountered within (note the ':' at
                    the end of the option, manpage for more details)
 -s --symlinks      follow symlinks
 -H --hardlinks     normally, when two or more files point to the same
                    disk area they are treated as non-duplicates; this
                    option will change this behavior
 -n --noempty       exclude zero-length files from consideration
 -A --nohidden      exclude hidden files from consideration
 -f --omitfirst     omit the first file in each set of matches
 -1 --sameline      list each set of matches on a single line
 -S --size          show size of duplicate files
 -m --summarize     summarize dupe information
 -q --quiet         hide progress indicator
 -d --delete        prompt user for files to preserve and delete all
                    others; important: under particular circumstances,
                    data may be lost when using this option together
                    with -s or --symlinks, or when specifying a
                    particular directory more than once; refer to the
                    fdupes documentation for additional information
 -N --noprompt      together with --delete, preserve the first file in
                    each set of duplicates and delete the rest without
                    prompting the user
 -v --version       display fdupes version
 -h --help          display this help message

So, now you can clean the duplicates out of your filesystem!