Questions tagged [uniq]

173 questions
168
votes
5 answers

What is the difference between "sort -u" and "sort | uniq"?

Everywhere I see someone needing to get a sorted, unique list, they always pipe to sort | uniq. I've never seen any examples where someone uses sort -u instead. Why not? What's the difference, and why is it better to use uniq than the unique flag to…
Benubird
  • 5,752
  • 10
  • 36
  • 41
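A minimal sketch of the comparison this question asks about. The `sample` helper and its input are hypothetical and not taken from the question. For whole-line comparison the two forms should print the same lines; they can differ once a sort key is given, and counting is only available through `uniq`.

```shell
# Hypothetical sample input: the first field repeats, the second differs.
sample() { printf '%s\n' 'a 1' 'b 1' 'a 2' 'a 1'; }

# Whole-line uniqueness: both forms print the same three lines.
sample | LC_ALL=C sort -u
sample | LC_ALL=C sort | uniq

# With a key, sort -u keeps one line per key (two lines here),
# while uniq still compares whole lines (three lines here).
sample | LC_ALL=C sort -u -k1,1
sample | LC_ALL=C sort -k1,1 | uniq

# Counting occurrences needs uniq; sort -u has no equivalent.
sample | LC_ALL=C sort | uniq -c
```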
104
votes
13 answers

How can I remove duplicates in my .bash_history, preserving order?

I really enjoy using control+r to recursively search my command history. I've found a few good options I like to use with it: # ignore duplicate commands, ignore commands starting with a space export HISTCONTROL=erasedups:ignorespace # keep the…
cwd
  • 44,479
  • 71
  • 146
  • 167
60
votes
4 answers

How to get only the unique results without having to sort data?

$ cat data.txt aaaaaa aaaaaa cccccc aaaaaa aaaaaa bbbbbb $ cat data.txt | uniq aaaaaa cccccc aaaaaa bbbbbb $ cat data.txt | sort | uniq aaaaaa bbbbbb cccccc $ The result that I need is to display all the lines from the original file removing all…
Lazer
  • 34,477
  • 25
  • 70
  • 75
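One common way to approach this, sketched under the assumption that the goal is to keep the first occurrence of each line in its original position. The function name `dedup_keep_order` is made up for illustration; the input reuses the sample lines shown in the question.

```shell
# Print a line only the first time it is seen.
# seen[] is an awk associative array keyed by the whole line ($0).
dedup_keep_order() {
    awk '!seen[$0]++' "$@"
}

# Sample lines from the question, fed on standard input.
printf '%s\n' aaaaaa aaaaaa cccccc aaaaaa aaaaaa bbbbbb | dedup_keep_order
```

Because every distinct line is held in memory, this may not suit inputs with a very large number of distinct lines.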
45
votes
2 answers

Common lines between two files

I have the following code that I run on my Terminal. LC_ALL=C && grep -F -f genename2.txt hg38.hgnc.bed > hg38.hgnc.goi.bed This doesn't give me the common lines between the two files. What am I missing there?
Marwah Soliman
  • 643
  • 3
  • 7
  • 10
43
votes
4 answers

How is uniq not unique enough that there is also uniq --unique?

Here are commands on a random file from pastebin: wget -qO - http://pastebin.com/0cSPs9LR | wc -l 350 wget -qO - http://pastebin.com/0cSPs9LR | sort -u | wc -l 287 wget -qO - http://pastebin.com/0cSPs9LR | sort | uniq | wc -l 287 wget -qO -…
enfascination
  • 551
  • 1
  • 4
  • 7
20
votes
8 answers

case-insensitive search of duplicate file-names

Is there a way to find all files in a directory with duplicate filenames, regardless of the casing (upper-case and/or lower-case)?
lamcro
  • 893
  • 1
  • 8
  • 12
19
votes
3 answers

Uniq won't remove duplicate

I was using the following command curl -silent http://api.openstreetmap.org/api/0.6/relation/2919627 http://api.openstreetmap.org/api/0.6/relation/2919628 | grep node | awk '{print $3}' | uniq when I wondered why uniq wouldn't remove the…
Matthieu Riegler
  • 507
  • 4
  • 7
  • 14
18
votes
1 answer

How to remove duplicate lines in a large multi-GB textfile?

My question is similar to this question but with a couple of different constraints: I have a large \n delimited wordlist -- one word per line. Size of files range from 2GB to as large as 10GB. I need to remove any duplicate lines. The process…
greatwolf
  • 283
  • 1
  • 2
  • 8
17
votes
12 answers

Delete duplicate lines pairwise?

I encountered this use case today. It seems simple at first glance, but fiddling around with sort, uniq, sed and awk revealed that it's nontrivial. How can I delete all pairs of duplicate lines? In other words, if there is an even number of…
Wildcard
  • 35,316
  • 26
  • 130
  • 258
15
votes
8 answers

How can I find the most frequent word in a .csv file, ignoring duplicates on each line?

I need to find the 10 most frequent words in a .csv file. The file is structured so that each line contains comma-separated words. If the same word is repeated more than once in the same line, it should be counted as one. So, in the example…
ginopino
  • 370
  • 2
  • 10
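A rough sketch of one possible approach, assuming plain comma-separated words with no quoting or embedded commas. The function name `top_words` and the test input are hypothetical, since the example in the question is cut off. Each word is counted at most once per line, then the totals are sorted and trimmed to ten.

```shell
top_words() {
    awk -F, '{
        split("", seen)                      # clear the per-line set
        for (i = 1; i <= NF; i++)
            if ($i != "" && !seen[$i]++)     # first time this word appears on this line
                count[$i]++
    }
    END { for (w in count) print count[w], w }' "$@" |
        LC_ALL=C sort -k1,1nr -k2,2 |        # highest count first, ties by word
        head -n 10
}

# Hypothetical input: "a" is repeated on the first line but counted once there.
printf '%s\n' 'a,b,a' 'b,c' 'b' | top_words
```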
15
votes
5 answers

Remove adjacent duplicate lines while keeping the order

I have a file with one column with names that repeat a number of times each. I want to condense each repeat into one, while keeping any other repeats of the same name that are not adjacent to other repeats of the same name. E.g. I want to turn the…
Age87
  • 549
  • 5
  • 11
15
votes
2 answers

What did `uniq -t` do?

I have some old code from 2003 which uses -t option for uniq command. It throws an error since that option is probably not supported anymore. Here's the piece which uses the command: egrep -n "{ IA32_OP" ia32-decode.c | \ awk '{ print $1 $3 $4…
Babken Vardanyan
  • 875
  • 2
  • 11
  • 15
14
votes
2 answers

Who killed my sort? or How to efficiently count distinct values from a csv column

I'm doing some processing, trying to count how many different lines there are in a file containing 160,353,104 lines. Here is my pipeline and stderr output. $ tail -n+2 2022_place_canvas_history.csv | cut -d, -f2 | tqdm --total=160353104 |\ sort -T. -S1G |…
wviana
  • 213
  • 1
  • 3
  • 9
14
votes
3 answers

What is the point of uniq -u and what does it do?

uniq seems to do something different than uniq -u, even though the description for both is "only unique lines". What's the difference here, what do they do?
user11350058
  • 159
  • 5
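A small illustration of the difference, using hypothetical input that is already sorted so that duplicates are adjacent. Plain `uniq` collapses each run of repeated lines to one copy, `-u` prints only the lines that are not repeated at all, and `-d` prints one copy of each line that is repeated.

```shell
# Hypothetical sorted input: "a" and "c" appear twice, "b" once.
sample() { printf '%s\n' a a b c c; }

sample | uniq      # one copy of every line: a, b, c
sample | uniq -u   # only lines that are never repeated: b
sample | uniq -d   # only lines that are repeated, one copy each: a, c
```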
12
votes
5 answers

How to create an array of unique elements from a string/array in bash?

If I have a string "1 2 3 2 1" - or an array [1,2,3,2,1] - how can I select the unique values, i.e. "1 2 3 2 1" produces "1 2 3" or [1,2,3,2,1] produces [1,2,3] Similar to uniq but uniq seems to work on whole lines, not patterns within a line...
Michael Durrant
  • 41,213
  • 69
  • 165
  • 232
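For the string form of this question, a sketch that works in any POSIX shell: put each word on its own line, keep the first occurrence of each, and join the result back with spaces. The function name `unique_words` is invented for illustration, and the sketch assumes the words are separated by spaces and contain no newlines.

```shell
unique_words() {
    printf '%s\n' "$1" |
        tr -s ' ' '\n' |        # one word per line
        awk '!seen[$0]++' |     # keep the first occurrence, preserving order
        paste -sd ' ' -         # join the lines back with single spaces
}

unique_words "1 2 3 2 1"
```

The result is a string; turning it into a bash array would be a separate step in bash itself.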