I have a file with the following lines:
1 a
2 a
3 a
1 b
2 b
1 c
2 c
3 c
4 c
1 d
I want to get the result as:
a 1 2 3
b 1 2
c 1 2 3 4
d 1
Using awk:
awk '{ group[$2] = (group[$2] == "" ? $1 : group[$2] OFS $1 ) }
END { for (group_name in group) print group_name, group[group_name] }' inputfile
This stores the groups in an array called group. This array is indexed on the group name (the second column in the input data) and for each line of input from inputfile, the value in the first column is appended to the correct group.
The END block loops over all collected groups and outputs the group name and the entries of that group.
This awk program with a nicer layout:
{
    group[$2] = (group[$2] == "" ? $1 : group[$2] OFS $1)
}
END {
    for (group_name in group)
        print group_name, group[group_name]
}
Note that this is not what you'd want to do if you have massive amounts of data, as the group array will store all the input data read from the file.
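One more caveat for the array-based program: the order in which for (group_name in group) visits the keys is unspecified in awk, so the groups are not guaranteed to come out as a, b, c, d. A minimal sketch of one way to get a stable order, assuming it is acceptable to sort the finished lines on the group name; the function name group_by_key is made up for this example:

```shell
#!/bin/sh
# Collect the first-column values per second-column key in an awk array,
# then sort the finished lines by key, since awk does not promise any
# particular order for "for (key in array)".
# group_by_key is a hypothetical helper name, not part of the original answer.
group_by_key() {
    awk '{ group[$2] = (group[$2] == "" ? $1 : group[$2] OFS $1) }
         END { for (group_name in group) print group_name, group[group_name] }' "$@" |
    sort -k1,1
}

# Usage: reads the named files, or standard input if none are given.
printf '%s\n' '1 c' '1 a' '2 c' '1 b' '2 a' | group_by_key
```

The values inside each group keep their input order; only the output lines are reordered.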
For huge amounts of data, we assume that the input is sorted on the group names (the second column) and use:
awk '$2 != group_name { if (group != "") print group_name, group; group = ""; group_name = $2 }
{ group = (group == "" ? $1 : group OFS $1) }
END { if (group != "") print group_name, group }' inputfile
This keeps track of what the current group is, and collects the data for that group. Whenever the second column in the input switches to another value, it outputs the collected group data and starts collecting new data. This means that only the data for the current group is ever stored, rather than the whole input data set.
This last awk program with a nicer layout:
$2 != group_name {
    if (group != "")
        print group_name, group
    group = ""
    group_name = $2
}
{
    group = (group == "" ? $1 : group OFS $1)
}
END {
    # Output last group (only), if there was any data at all.
    if (group != "")
        print group_name, group
}
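If the input is not already sorted on the second column, the streaming program can be fed from sort. This is only a sketch under assumptions: the function name group_sorted_stream is made up, and the secondary numeric key on column 1 is a choice made here, which means the values within each group come out in numeric order rather than input order. The sorting work is handed to sort, which is generally better suited to large inputs than an awk array.

```shell
#!/bin/sh
# Sort on the group name (column 2), then numerically on the value
# (column 1), so that each group arrives as one contiguous run of lines.
# The awk program then only ever holds the current group.
# group_sorted_stream is a hypothetical helper name.
group_sorted_stream() {
    sort -k2,2 -k1,1n "$@" |
    awk '$2 != group_name {
             if (group != "") print group_name, group
             group = ""
             group_name = $2
         }
         { group = (group == "" ? $1 : group OFS $1) }
         END { if (group != "") print group_name, group }'
}

# Usage: reads the named files, or standard input if none are given.
printf '%s\n' '2 c' '1 a' '1 c' '1 b' '2 a' | group_sorted_stream
```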
Try this:
for i in $(awk '!a[$2]++ { print $2 }' file.txt)
do
    echo "$i $(awk -v z="$i" '$2 == z { print $1 }' file.txt | tr '\n' ' ')"
done
awk '!a[$2]++ { print $2 }' prints each unique value of column 2 once. $2 == z { print $1 } prints the first column of every line where $2 equals the variable z.
Command:
for i in a b c d; do echo $i;awk -v i="$i" '$2 == i{print $1}' filename| perl -pne "s/\n/ /g";echo " "| perl -pne "s/ /\n/g";done| sed '/^$/d'| sed "N;s/\n/ /g"
Output:
for i in a b c d; do echo $i;awk -v i="$i" '$2 == i{print $1}' l.txt | perl -pne "s/\n/ /g";echo " "| perl -pne "s/ /\n/g";done| sed '/^$/d'| sed "N;s/\n/ /g"
a 1 2 3
b 1 2
c 1 2 3 4
d 1