1

I have a csv file with seven numbers per line like this:

1083,20,28,42,23,10,43
1084,20,5,29,59,40,33
1085,39,50,21,12,40,55
1086,45,4,6,23,10,2
1087,36,46,28,32,3,20

I want to keep the first number in place (column 1) and sort columns 2 to 7, making the file like

1083,10,20,23,28,42,43
1084,5,20,29,33,40,59
1085,12,21,39,40,50,55
1086,2,4,6,10,45,23
1087,3,20,28,32,36,46

How can I do that with awk, sed or whatever?

Thanks

Jeff Schaller
  • 66,199
  • 35
  • 114
  • 250
Duck
  • 4,434
  • 19
  • 51
  • 64

4 Answers4

2

With perl:

perl -F, -lape '$_ = join ",", shift @F, sort {$a <=> $b} @F' < input.csv

With GNU awk:

gawk -F, '
  {
    split($0, a)
    printf "%s", a[1]
    delete a[1]
    n = asort(a)
    for (i = 1; i <= n ; i++) printf "%s", FS a[i]
    print ""
  }' < input.csv

Or using the join.awk library (thanks @WeijunZhou)

gawk -i join -F, -v OFS=, '
  {
    split($0, a)
    first = a[1]
    delete a[1]
    n = asort(a)
    print first, join(a, 1, n, OFS)
  }'
Stéphane Chazelas
  • 522,931
  • 91
  • 1,010
  • 1,501
  • thanks but that is destroying part of the file. Some lines end like this `,47,50,5519` – Duck Feb 26 '19 at 10:12
  • 1
    @SpaceDog, sounds like your file has CRLF MS-DOS line endings, convert it to Unix first (using `dos2unix` for instance) or insert a `s/\r$//;` at the start of the `perl` code above. – Stéphane Chazelas Feb 26 '19 at 10:15
  • I don't gave gawk. I am no macOS. Your perl command is not working even on the unix converted file. – Duck Feb 26 '19 at 10:27
  • In your `gawk` code, `-F,` is not needed. Assuming nonneg numbers, with recent `gawk`, `gawk -F, '@include "join";{split($0,a,",");a[1]=0;asort(a);print $1","join(a,2,7,",")}'` – Weijun Zhou Feb 26 '19 at 10:29
  • @WeijunZhou - unfortunately I do not have gawk on macOS. – Duck Feb 26 '19 at 10:32
  • Then you need to implement `asort` yourself. Using `perl` is much simpler but I am not familiar with it. – Weijun Zhou Feb 26 '19 at 10:37
  • @SpaceDog, you can install it using [Homebrew](https://brew.sh/) – glenn jackman Feb 26 '19 at 17:35
1

The problem you are facing is you have files with MAC line endings (\r) whereas the commands you are running are assuming Unix line endings (\n).

Use Perl:

$ perl -l015 -F, -0015 -pae '$_ = join ",", shift @F, sort { $a <=> $b } @F' input.csv

Options:

  • -l sets ORS to \r = octal 15.
  • -0 sets RS to \r = octal 15.
  • -p will loop thru your csv file on a per record basis. A record is split from the input file by the RS, which is \r as set by the -0 option.
  • -F will set the field separator to comma for each record as it is read.
  • -a will split each record as it is being read into fields and stored in array @F.
  • Note: the order of supplying the options is important.

So the output that you will get shall also have MAC line endings. That is why you are NOT seeing your expected output, rather all your output records appear squashed into a single line based on all the solutions given here.

  • Now we are talking. Your answer is perfect and is the only one that works for me and the only one that fits what I was saying. Cannot upvote you enough. THANKS! – Duck Feb 27 '19 at 12:00
-2

For simple csv without quoting, try this:

while IFS= read -r l; do
    col1=$(printf '%s' "$l" | cut -d, -f1)
    printf '%s,' "$col1"
    printf '%s' "$l" | cut -d, -f2- | grep -o "[0-9]*" | sort -n | paste -sd,
done < file
pLumo
  • 22,231
  • 2
  • 41
  • 66
  • file is completely destroyed, like this... 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,... – Duck Feb 26 '19 at 10:42
  • 1
    Works fine for me on your given test file. – Weijun Zhou Feb 26 '19 at 10:47
  • sorry, not working for me at all. – Duck Feb 26 '19 at 11:17
  • @SpaceDog, what does `file your-file.csv` return? – Stéphane Chazelas Feb 26 '19 at 11:18
  • I have two versions of the file, one unix and one windows. [Here](https://ufile.io/5angu) is the unix version of the file if you want to test. – Duck Feb 26 '19 at 11:24
  • for me it works just as I think it should using your "unix" version of the file ... Don't know what you're doing. And don't know why these downvotes ... – pLumo Feb 26 '19 at 12:04
  • 1
    About one of the downvotes, see [Why is using a shell loop to process text considered bad practice?](//unix.stackexchange.com/q/169716), [Security implications of forgetting to quote a variable in bash/POSIX shells](//unix.stackexchange.com/q/171346), [Why is printf better than echo?](//unix.stackexchange.com/q/65803) (also the part about the first argument of `printf` being the _format_). – Stéphane Chazelas Feb 26 '19 at 15:12
  • thanks, I changed the printf and variables not quoted part ... – pLumo Feb 26 '19 at 15:17
-2

Tried with below method combination of awk and sed

j=awk '{print NR}' filename | sort -nr| sed -n '1p'

for ((i=1;i<=$j;i++)); do k=`awk -F "," -v  i="$i" 'NR==i {print $1}' i.txt` ;echo $k;awk -F "," -v i="$i" 'NR==i {$1=" ";print $0}' i.txt| sed -r "s/^\s+//g"| perl -pne "s/ /\n/g"| sort -n| perl -pne "s/\n/ /g";echo " "| sed "s/.*/&\n/g";done|sed '/^$/d'| sed "N;s/\n/ /g"| sed "s/ /,/g"| sed "s/\,*$//g"

output

1083,10,20,23,28,42,43
1084,5,20,29,33,40,59
1085,12,21,39,40,50,55
1086,2,4,6,10,23,45
1087,3,20,28,32,36,46
Praveen Kumar BS
  • 5,139
  • 2
  • 9
  • 14
  • I do not even know how to use this... You have a variable declared on the first line (?) then a second line... ??? – Duck Feb 26 '19 at 11:06