Questions tagged [uniq]
173 questions
168 votes, 5 answers
What is the difference between "sort -u" and "sort | uniq"?
Everywhere I see someone needing to get a sorted, unique list, they always pipe to sort | uniq. I've never seen any examples where someone uses sort -u instead. Why not? What's the difference, and why is it better to use uniq than the unique flag to…
Benubird (5,752)
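For a plain sorted-unique list the two commands are interchangeable; what the pipe buys you is access to uniq's extra modes. A small sketch:

```shell
# sort -u and sort | uniq produce the same de-duplicated, sorted output:
printf 'b\na\nb\na\n' | sort -u          # a, b
printf 'b\na\nb\na\n' | sort | uniq      # a, b

# uniq's options are what still need the pipe, e.g. counting occurrences:
printf 'b\na\nb\na\n' | sort | uniq -c   # 2 a, 2 b
```

`sort -u` also avoids writing the duplicate lines into the pipe at all, which matters on very large inputs.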
104 votes, 13 answers
How can I remove duplicates in my .bash_history, preserving order?
I really enjoy using Ctrl+R to interactively search my command history. I've found a few good options I like to use with it:
# ignore duplicate commands, ignore commands starting with a space
export HISTCONTROL=erasedups:ignorespace
# keep the…
cwd (44,479)
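Beyond HISTCONTROL, a common order-preserving de-dup is an awk hash; reversing the file first keeps the *most recent* copy of each command, which usually suits a history file. A sketch on demo data:

```shell
# Demo history file; the same pipeline applies to ~/.bash_history.
printf 'ls\ncd /tmp\nls\npwd\n' > hist.demo
# tac reverses the file, awk keeps the first copy it sees, tac restores order,
# so only the last occurrence of each command survives:
tac hist.demo | awk '!seen[$0]++' | tac
# cd /tmp
# ls
# pwd
```

Write the result to a temporary file and move it into place rather than redirecting onto the file you are reading.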
60 votes, 4 answers
How to get only the unique results without having to sort data?
$ cat data.txt
aaaaaa
aaaaaa
cccccc
aaaaaa
aaaaaa
bbbbbb
$ cat data.txt | uniq
aaaaaa
cccccc
aaaaaa
bbbbbb
$ cat data.txt | sort | uniq
aaaaaa
bbbbbb
cccccc
$
The result that I need is to display all the lines from the original file removing all…
Lazer (34,477)
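Plain uniq only collapses *adjacent* repeats, which is why the first attempt above leaves a second `aaaaaa`. The classic awk one-liner keeps the first occurrence of every line without sorting (same data as above, piped in):

```shell
# Prints each distinct line once, in original order:
printf '%s\n' aaaaaa aaaaaa cccccc aaaaaa aaaaaa bbbbbb | awk '!seen[$0]++'
# aaaaaa
# cccccc
# bbbbbb
```

`seen[$0]++` is 0 (false) the first time a line appears and non-zero afterwards, so only first sightings are printed.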
45 votes, 2 answers
Common lines between two files
I have the following code that I run on my Terminal.
LC_ALL=C && grep -F -f genename2.txt hg38.hgnc.bed > hg38.hgnc.goi.bed
This doesn't give me the common lines between the two files. What am I missing there?
Marwah Soliman (643)
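Assuming the goal really is lines present in both files, comm is the purpose-built tool; it needs sorted input. A sketch on demo files (substitute the question's genename2.txt and hg38.hgnc.bed):

```shell
printf 'BRCA1\nTP53\nMYC\n'  | sort > genes.a.sorted
printf 'TP53\nEGFR\nBRCA1\n' | sort > genes.b.sorted
# comm's third column is lines common to both; -12 suppresses columns 1 and 2:
comm -12 genes.a.sorted genes.b.sorted
# BRCA1
# TP53
```

Note that `grep -F -f` as used above matches *substrings* of each line, which is a different operation from whole-line intersection.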
43 votes, 4 answers
How is uniq not unique enough that there is also uniq --unique?
Here are commands on a random file from pastebin:
wget -qO - http://pastebin.com/0cSPs9LR | wc -l
350
wget -qO - http://pastebin.com/0cSPs9LR | sort -u | wc -l
287
wget -qO - http://pastebin.com/0cSPs9LR | sort | uniq | wc -l
287
wget -qO -…
enfascination (551)
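The counts differ because plain uniq (like sort -u) keeps one copy of every line, while uniq -u / --unique drops any line that was repeated at all. A small sketch:

```shell
printf 'a\na\nb\nc\nc\n' | sort | uniq | wc -l      # 3: one copy each of a, b, c
printf 'a\na\nb\nc\nc\n' | sort | uniq -u | wc -l   # 1: only b was never repeated
```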
20 votes, 8 answers
case-insensitive search of duplicate file-names
Is there a way to find all files in a directory with duplicate filenames, regardless of case (upper-case and/or lower-case)?
lamcro (893)
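One sketch, using GNU find (the -printf action is a GNU extension): case-fold every basename, then let uniq -d report the names that occur more than once:

```shell
# Demo tree with a case-only collision:
mkdir -p dup-demo && touch dup-demo/Foo.txt dup-demo/foo.TXT dup-demo/bar
# Basenames only, lower-cased; uniq -d prints each duplicated name once:
find dup-demo -type f -printf '%f\n' | tr '[:upper:]' '[:lower:]' | sort | uniq -d
# foo.txt
```

This also reports collisions between files in *different* subdirectories, which may or may not be what you want.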
19 votes, 3 answers
uniq won't remove duplicates
I was using the following command
curl -silent http://api.openstreetmap.org/api/0.6/relation/2919627 http://api.openstreetmap.org/api/0.6/relation/2919628 | grep node | awk '{print $3}' | uniq
when I wondered why uniq wouldn't remove the…
Matthieu Riegler (507)
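The usual culprit: uniq only removes *adjacent* duplicates, and concatenating two API responses puts identical node lines far apart. Sorting first (or using sort -u) fixes it:

```shell
printf 'x\ny\nx\n' | uniq           # x, y, x -- the repeats are not adjacent
printf 'x\ny\nx\n' | sort | uniq    # x, y
```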
18 votes, 1 answer
How to remove duplicate lines in a large multi-GB textfile?
My question is similar to this question but with a couple of different constraints:
I have a large \n-delimited wordlist, one word per line. File sizes range from 2 GB to as large as 10 GB.
I need to remove any duplicate lines.
The process…
greatwolf (283)
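GNU sort already works out-of-core, spilling sorted runs to disk and merging them, so sort -u is a reasonable first attempt even at 10 GB. The buffer size and scratch directory below are illustrative knobs, not required values; the demo file stands in for the real wordlist:

```shell
printf 'zebra\napple\nzebra\nmango\n' > words.txt
# -S sizes the in-memory buffer, -T picks the scratch directory for spill
# files, and LC_ALL=C uses fast byte-wise comparison instead of locale
# collation:
LC_ALL=C sort -u -S 256M -T /tmp words.txt -o words.uniq.txt
cat words.uniq.txt
# apple
# mango
# zebra
```

Make sure the -T directory has enough free space for roughly a full copy of the input.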
17 votes, 12 answers
Delete duplicate lines pairwise?
I encountered this use case today. It seems simple at first glance, but fiddling around with sort, uniq, sed and awk revealed that it's nontrivial.
How can I delete all pairs of duplicate lines? In other words, if there is an even number of…
Wildcard (35,316)
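One reading of the problem, keep a line only if it occurs an odd number of times overall, can be sketched in awk (whether pairs must be *adjacent* changes the answer, which is part of why this is nontrivial):

```shell
# Keep one copy of each line whose total count is odd, in first-seen order:
printf 'a\na\nb\nc\nc\nc\n' |
awk '{ if (!($0 in cnt)) order[++n] = $0; cnt[$0]++ }
     END { for (i = 1; i <= n; i++) if (cnt[order[i]] % 2) print order[i] }'
# b
# c
```

Here `a` appears twice (one pair, fully deleted), `b` once (kept), `c` three times (one pair deleted, one copy kept).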
15 votes, 8 answers
How can I find the most frequent word in a .csv file, ignoring duplicates on each line?
I need to find the 10 most frequent words in a .csv file.
The file is structured so that each line contains comma-separated words. If the same word is repeated more than once in the same line, it should be counted as one.
So, in the example…
ginopino (370)
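A sketch: de-duplicate within each line first, so a word counts at most once per line, then tally across lines and sort by count. `split("", seen)` empties the per-line array portably:

```shell
printf 'apple,pear,apple\npear,plum\npear\n' |
awk -F, '{ split("", seen)
           for (i = 1; i <= NF; i++) if (!seen[$i]++) count[$i]++ }
         END { for (w in count) print count[w], w }' |
sort -rn | head -10
# top line: 3 pear
```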
15 votes, 5 answers
Remove adjacent duplicate lines while keeping the order
I have a file with one column of names, each repeated a number of times. I want to condense each run of adjacent repeats into a single line, while keeping repeats of the same name that are not adjacent to that run.
E.g. I want to turn the…
Age87 (549)
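Collapsing each run of adjacent repeats while leaving non-adjacent repeats alone is exactly what plain uniq does, with no sort needed:

```shell
printf 'ann\nann\nbob\nbob\nann\n' | uniq
# ann
# bob
# ann
```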
15 votes, 2 answers
What did `uniq -t` do?
I have some old code from 2003 which uses the -t option of the uniq command. It throws an error, since that option is apparently no longer supported.
Here's the piece which uses the command:
egrep -n "{ IA32_OP" ia32-decode.c | \
awk '{ print $1 $3 $4…
Babken Vardanyan (875)
14 votes, 2 answers
Who killed my sort? Or: how to efficiently count distinct values from a CSV column
I'm trying to count how many distinct lines there are in a file containing 160,353,104 lines. Here are my pipeline and its stderr output.
$ tail -n+2 2022_place_canvas_history.csv | cut -d, -f2 | tqdm --total=160353104 |\
sort -T. -S1G |…
wviana (213)
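If the sort is being killed (typically by the OOM killer), counting distinct values in an awk hash avoids sorting altogether; memory then scales with the number of *distinct* values rather than total lines. Demo CSV below; the question's file is 2022_place_canvas_history.csv:

```shell
printf 'ts,user\n1,a\n2,b\n3,a\n4,c\n' > canvas.csv
# Skip the header, take column 2, count first sightings in a hash:
tail -n +2 canvas.csv | cut -d, -f2 |
awk '!($0 in seen) { seen[$0]; n++ } END { print n }'
# 3
```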
14 votes, 3 answers
What is the point of uniq -u and what does it do?
uniq seems to do something different from uniq -u, even though both are described as printing "only unique lines".
What's the difference here, what do they do?
user11350058 (159)
12 votes, 5 answers
How to create an array of unique elements from a string/array in bash?
If I have a string "1 2 3 2 1" - or an array [1,2,3,2,1] - how can I select the unique values, i.e.
"1 2 3 2 1" produces "1 2 3"
or
[1,2,3,2,1] produces [1,2,3]
Similar to uniq but uniq seems to work on whole lines, not patterns within a line...
Michael Durrant (41,213)
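A pure-bash sketch using an associative array (bash 4+) to keep the first occurrence of each element:

```shell
arr=(1 2 3 2 1)
declare -A seen
result=()
for v in "${arr[@]}"; do
  # Append v only the first time we see it:
  if [[ -z ${seen[$v]} ]]; then
    seen[$v]=1
    result+=("$v")
  fi
done
echo "${result[@]}"   # 1 2 3
```

For the string form, word-split it into an array first, e.g. `read -ra arr <<< "1 2 3 2 1"`.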