
I have a file made up of several sections. Each section starts with its own title, but all of them end with the same string. I want to sort the sections by title without sorting the content of each section (i.e. treat each whole section as one block). There is also a blank line between every two sections. To clarify the idea, if the input is:

string5
z
y
x
string

string2
x
z
y
f
string

the desired output would be:

string2
x
z
y
f
string

string5
z
y
x
string
αғsнιη
Mohsen El-Tahawy
  • What are the actual values of `string`, `string2` and `string5`, and may they occur as part of the `x`, `z`, etc.? Are any of the lines within a section empty? Are you sorting on the `2` and `5` in `string2` and `string5`, or on the whole string? – Kusalananda Jun 18 '20 at 06:46
  • string2 and string5 are titles containing a lot of words starting with R01, R02, R03 ... etc., while the end of each section, "string", is just a single word – Mohsen El-Tahawy Jun 18 '20 at 06:50

3 Answers


Using GNU sed and sort:

sed 's/^$/\x0/g' file | sort -z | tr '\0' '\n'
  • Put a null character on each empty line.
  • Sort using the null character as the record delimiter (-z).
  • Finally, replace the null delimiters with newlines using tr.
  • To remove the empty lines at the start and end of the output, you may add | sed '1{/^$/d};${/^$/d}'
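The steps above can be tried end to end on the sample from the question. This is only a minimal sketch, assuming GNU sed and GNU sort are installed; the function name `sort_sections` and the temporary sample file are hypothetical and not part of the original answer.

```shell
# Sort blank-line-separated sections with GNU sed and GNU sort.
# sort_sections is a hypothetical helper name, not part of any tool.
sort_sections() {
  sed 's/^$/\x0/' "$1" |        # mark each empty line with a NUL byte (GNU sed)
    sort -z |                    # sort NUL-delimited records (GNU sort)
    tr '\0' '\n' |               # turn the NUL bytes back into newlines
    sed '1{/^$/d};${/^$/d}'      # drop an empty first or last line
}

# Build the sample input from the question and sort it.
sample=$(mktemp)
printf '%s\n' string5 z y x string '' string2 x z y f string > "$sample"
sort_sections "$sample"
rm -f "$sample"
```

One caveat worth checking on real data: the first section in the file is the only record that does not begin with a newline. Under a byte-wise collation such as `LC_ALL=C`, that record therefore sorts after all the others regardless of its title.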

Output:

string2
x
z
y
f
string

string5
z
y
x
string

(Maybe someone can help make \x0 work for non-GNU sed; see the related question.)
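Where sed does not understand \x0 or sort lacks -z, one possible workaround avoids NUL bytes altogether: join each section into a single line, sort the lines, then split them again. This is a sketch rather than a tested replacement for the pipeline above. The function name `sort_sections_portable` is hypothetical, and it assumes the byte \001 never occurs in the input.

```shell
# Portable sketch: no NUL bytes, so no dependence on GNU sed or sort -z.
# sort_sections_portable is a hypothetical helper name.
# Assumption: the byte \001 never occurs in the input file.
sort_sections_portable() {
  # Paragraph mode: each blank-line-separated section is one record,
  # each of its lines one field. Join the lines with \001.
  awk 'BEGIN { RS = ""; FS = "\n" }
       { line = $1
         for (i = 2; i <= NF; i++) line = line "\001" $i
         print line }' "$1" |
    sort |
    # Restore the newlines and put one blank line between sections.
    awk '{ gsub("\001", "\n"); print sep $0; sep = "\n" }'
}

# Try it on the sample input from the question.
sample=$(mktemp)
printf '%s\n' string5 z y x string '' string2 x z y f string > "$sample"
sort_sections_portable "$sample"
rm -f "$sample"
```

Because every section becomes a line that starts with its title, no leading or trailing empty line needs to be cleaned up afterwards.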

pLumo
  • this just adds x0 to the blank line, keeping the rest as it is – Mohsen El-Tahawy Jun 18 '20 at 07:07
  • interesting, maybe a GNU vs non-GNU thing again o0 – pLumo Jun 18 '20 at 07:09
  • yep, it was related to the GNU vs non-GNU issue – Mohsen El-Tahawy Jun 18 '20 at 07:13
  • this adds two empty lines between blocks. As for `\x0`, changing it to `\o0` might be more compatible. – αғsнιη Jun 18 '20 at 07:33
  • On OS X, if you want to work around the unsupported `\x0` problem, you can install GNU utils e.g. via Homebrew, and use `gsed` instead of `sed`. – EdwardTeach Feb 03 '22 at 17:16
  • This answer was extremely helpful, thanks! I also had the non-GNU issue when running my script in Linux Alpine. Ended up solving the problem by installing GNU sed, as described in [this solution](https://stackoverflow.com/questions/64957014/alpine-linux-sed-what-is-the-equivalent-of-z-or-null-data-switch-when-usi), in case others face the same issue. – Nathalie Laroche Nov 09 '22 at 16:36

Using GNU awk in paragraph mode: store each block in an array, traverse the array ordered by its values compared as strings, then print:

awk -v RS= '{ seen[NR]=$0 }
END { PROCINFO["sorted_in"]="@val_str_asc";
      for (block in seen) {print sep seen[block]; sep=ORS}
}' infile
αғsнιη

With perl:

perl -l -00 -e '
  chomp(@paragraphs = <>);
  print join "\n\n", sort @paragraphs' your-file
Stéphane Chazelas