Questions tagged [duplicate-files]

21 questions
201
votes
20 answers

Is there an easy way to replace duplicate files with hardlinks?

I'm looking for an easy way (a command or series of commands, probably involving find) to find duplicate files in two directories, and replace the files in one directory with hardlinks of the files in the other directory. Here's the situation: This…
Josh
  • 8,311
  • 12
  • 54
  • 73
115
votes
13 answers

Find duplicate files

Is it possible to find duplicate files on my disk that are bit-for-bit identical but have different file names?
student
  • 17,875
  • 31
  • 103
  • 169
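
One common approach to this kind of question can be sketched as follows. It is only a sketch, not the method from any particular answer: group files by size first, then hash only the files that share a size. The names `file_digest` and `find_duplicates` are hypothetical, and treating equal SHA-256 digests as "identical" is an assumption; a final byte-by-byte comparison would remove any remaining doubt.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root with identical digests."""
    # Step 1: bucket regular files by size; a unique size cannot be a duplicate.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates("/some/dir")` returns one list per set of matching files, regardless of their names. Hard-linked files are reported as duplicates too, since they have the same content.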
81
votes
3 answers

What's the quickest way to find duplicated files?

I found this command for finding duplicated files, but it was quite long and confused me. For example, when I removed -printf "%s\n", nothing came out. Why was that? Also, why did they use xargs -I{} -n1? Is there any easier way to find…
The One
  • 4,662
  • 11
  • 29
  • 35
24
votes
5 answers

Finding duplicate files and replace them with symlinks

I'm trying to find a way to check inside a given directory for duplicate files (even with different names) and replace them with symlinks pointing to the first occurrence. I've tried with fdupes but it just lists those duplicates. That's the…
Sekhemty
  • 824
  • 1
  • 12
  • 27
20
votes
8 answers

case-insensitive search of duplicate file-names

Is there a way to find all files in a directory with duplicate filenames, regardless of case (upper-case and/or lower-case)?
lamcro
  • 893
  • 1
  • 8
  • 12
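
A minimal sketch of one way to approach this, assuming the search should cover the whole tree below a directory and that "the same name" means equal after Unicode case folding. The function name `find_name_clashes` is hypothetical.

```python
import os
from collections import defaultdict


def find_name_clashes(root):
    """Map each case-folded file name to the paths that share it (2 or more)."""
    by_name = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # casefold() is a more aggressive lower() intended for caseless matching.
            by_name[name.casefold()].append(os.path.join(dirpath, name))
    return {key: sorted(paths) for key, paths in by_name.items() if len(paths) > 1}
```

The result maps a folded name such as `readme.txt` to every path whose base name matches it when case is ignored.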
2
votes
1 answer

Remove all duplicate image files except for 1

I have a folder of images that contains quite a few duplicates, and I'd like to remove all duplicates except for one. Upon Googling I found this clever script from this post that succinctly does almost what I want it to do: #!/bin/sh -eu find…
2
votes
2 answers

Delete duplicates from another directory recursively

(N.B. There are many similar questions (e.g. here, here, here, and here) but they either assume that the directory structure is one-deep, or the answers are more complex multi-line scripts.) This is my situation: . ├── to_keep │   ├── a │   │   └──…
dumbledad
  • 121
  • 5
2
votes
0 answers

Evaluate the similarity between two video files

I'm looking for a method that reliably gives me a "mathematical distance" of two videos from one another. Similar to how the Levenshtein distance can be used to get the distance from a string to another. Is there a command or sequence of commands…
What
  • 356
  • 1
  • 12
1
vote
1 answer

How to use rmlint to merge two large folders?

In exploring options to merge two folders, I've come across a very powerful tool known as rmlint. It has some useful documentation (and Gentle Guide). I have a scenario that I previously mentioned and to which I received some great answers: How to…
ylluminate
  • 591
  • 7
  • 16
1
vote
0 answers

Copy unique files to new directory?

I have a number of folders with my various media (e.g. photos, music) from different points in time. The different folders have some of the same content (e.g. a photo might be in 2 folders), but should be mostly unique. There are no guarantees on…
Vasu
  • 111
  • 1
1
vote
3 answers

What is the most efficient way to find duplicate files?

I have a number of folders with a few million files (amounting to a few TB) in total. I wish to find duplicates across all files. The output ideally is a simple list of dupes - I will process them further with my own scripts. I know that there is…
Ned64
  • 8,486
  • 9
  • 48
  • 86
1
vote
2 answers

Using FSlint to find duplicates by file size only?

I'm trying to use fslint to find duplicates, but it takes forever hashing entire multi-gigabyte files. According to this website, I can compare by the following features: feature summary compare by file size compare by hardlinks compare by md5…
SurpriseDog
  • 572
  • 3
  • 18
0
votes
1 answer

How can you list a directory using the inode not the directory name? I have the same directory name appearing twice with different inodes

When I do a directory listing of a Python installation, the include directory appears twice, and each entry has a different inode. ╰─○ ls -i1 2282047 bin 2641630 include 2642559 include 2282048 lib 2641850 share I assume that their contents may be…
vfclists
  • 7,215
  • 14
  • 51
  • 79
0
votes
1 answer

Find Duplicate Files, but Specify a Directory to Keep

I am working on de-cluttering a company shared drive, and looking to remove duplicates. Is there any duplicate finding program that allows you to specify which directory's duplicates are to be removed? I would like to be able to do: fdupes -rdN…
0
votes
2 answers

How to use `rmlint` to remove duplicates only from one location and leave all else untouched?

I have two locations, /path/to/a and /path/to/b. I need to find duplicate files in both paths and remove only the items in /path/to/b. rmlint generates quite a large removal script, but it contains entries from both paths (and even empty folders)…
ylluminate
  • 591
  • 7
  • 16