12

If I have a string "1 2 3 2 1" - or an array [1,2,3,2,1] - how can I select the unique values, i.e.

"1 2 3 2 1" produces "1 2 3" 

or

[1,2,3,2,1] produces [1,2,3]

Similar to uniq but uniq seems to work on whole lines, not patterns within a line...

Michael Durrant

5 Answers

12

If you are using zsh:

$ array=(1 2 3 2 1)
$ echo ${(u)array[@]}
1 2 3

or (if KSH_ARRAYS option is not set) even

$ echo ${(u)array}
1 2 3
jimmij
  • 1
    If the array may contain empty elements, you should use `"${(u)array[@]}"` or `"${(@u)array}"` instead (note the quotes). – Stéphane Chazelas Nov 10 '14 at 19:49
  • I am using _zsh 5.1.1 (x86_64-ubuntu-linux-gnu)_, and `${(u)array}` works even if the array is empty or contains an empty string, without quotes. – apaderno Jul 01 '17 at 00:06
6

With GNU awk (this also retains original order)

printf '%s\n' "1 2 3 2 1" | awk -v RS='[[:space:]]+' '!a[$0]++{printf "%s%s", $0, RT}'
1 2 3 

To read into a bash array

read -ra arr<<<$(printf '%s\n' "1 2 3 2 1" |
 awk -v RS='[[:space:]]+' '!a[$0]++{printf "%s%s", $0, RT}')
printf "%s\n"  "${arr[@]}"
1
2
3
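
Where GNU awk is not available, a similar effect for a single line of whitespace-separated words can be sketched with plain POSIX awk by looping over the fields. This is only an illustration, not part of the original answer; the function name `dedupe_words` is made up for the example, and it normalizes the separators to single spaces.

```shell
# Hypothetical helper: print the distinct words of "$1" in first-seen order.
# Assumes words are separated by blanks; seen[] persists across input lines,
# so duplicates are dropped across the whole input, not per line.
dedupe_words() {
  printf '%s\n' "$1" | awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      if (!seen[$i]++)
        out = out (out == "" ? "" : " ") $i
    print out
  }'
}

dedupe_words "1 2 3 2 1"
```

Unlike the `RS`/`RT` version, this one does not reproduce the original separators.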
iruvar
5

For an array with arbitrary values, it's quite tricky with bash as it doesn't have a builtin operator for that.

bash, however, cannot store NUL characters in its variables, so you can take advantage of that and use NUL as the delimiter when passing the array elements to other commands:

The equivalent of zsh's:

new_array=("${(@u)array}")

on a recent GNU system, could be:

eval "new_array=($(
  printf "%s\0" "${array[@]}" |
    LC_ALL=C sort -zu |
    xargs -r0 bash -c 'printf "%q\n" "$@"' sh
  ))"

Alternatively, with recent versions of bash, and assuming none of the array elements are empty, you could use associative arrays:

unset hash
typeset -A hash
for i in "${array[@]}"; do
  hash[$i]=
done
new_array=("${!hash[@]}")

With bash 4.4 and newer and with GNU sort:

readarray -td '' new_array < <(
  printf '%s\0' "${array[@]}" | LC_ALL=C sort -zu)

The order of the elements would not be the same in those different solutions.

With tcsh:

set -f new_array = ($array:q)

This retains the first occurrence of each element (a b a => a b), like zsh's (u) expansion flag.

set -l new_array = ($array:q)

This retains the last occurrence (a b a => b a). Both, however, remove empty elements from the array.

Stéphane Chazelas
1

This solution worked for me.

ids=(1 2 3 2 1)
echo "${ids[@]}" | tr ' ' '\n' | sort -u | tr '\n' ' '

The above produces 1 2 3 as the output.

A shorter version, as suggested by Costas, could be:

printf "%s\n" "${ids[@]}" | sort -u | tr '\n' ' '

To store the result in an array, you could do something like:

IFS=$' '
arr=($(printf "%s\n" "${ids[@]}" | sort -u | tr '\n' ' '))
unset IFS

Now, when I do an echo on arr, this is the output I get.

echo "${arr[@]}"
1 2 3
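
One caveat worth noting: `sort -u` emits its output in sorted order, so the original order of the elements is not preserved. A small bash sketch with illustrative values (not taken from the answer above) shows the effect:

```shell
# Illustrative input whose first-seen order (3 1 2) differs from sorted order.
ids=(3 1 2 1 3)

# sort -u removes duplicates but also sorts; tr joins the lines with spaces.
sorted=$(printf '%s\n' "${ids[@]}" | sort -u | tr '\n' ' ')
echo "$sorted"
```

If the first-seen order matters, an approach that tracks already-seen values is probably a better fit than `sort -u`.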

References

https://stackoverflow.com/a/13648438/1742825
https://stackoverflow.com/a/9449633/1742825

Ramesh
0

To do it entirely in the shell and put the result in an array:

declare -A seen
for word in one two three two one
do
        if [ ! "${seen[$word]}" ]
        then
                result+=("$word")
                seen[$word]=1
        fi
done
echo "${result[@]}"

In words: if we haven’t seen a given word yet, add it to the result array and flag it as having been seen.  Once a word has been seen, ignore subsequent appearances of it.
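
The same idea can be wrapped in a function, which is one way to keep the bookkeeping array from clashing with an existing variable. The following is a minimal sketch assuming bash 4 or newer; the name `unique_args` and the `x` key prefix are choices made for this example, and elements containing newlines would not survive the `mapfile` step.

```shell
# Hypothetical helper: print each distinct argument once, in first-seen order,
# one per line. Requires bash 4+ for associative arrays.
unique_args() {
  local -A seen=()   # local to the function, so an outer $seen cannot interfere
  local word
  for word in "$@"; do
    # Prefix the key with "x" so that an empty element is still a valid subscript.
    if [[ -z ${seen[x$word]} ]]; then
      printf '%s\n' "$word"
      seen[x$word]=1
    fi
  done
}

# Collect the output lines into an array.
mapfile -t result < <(unique_args one two three two one)
echo "${result[@]}"
```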

  • 2
    Note that you need `unset seen` before `declare -A seen` in case `$seen` was previously defined (even as a scalar variable from the environment). – Stéphane Chazelas Nov 10 '14 at 20:10