134

I need to sort the output of find alphabetically before piping it to a command. Inserting | sort | between the two didn't work, so what can I do?

find folder1 folder2 -name "*.txt" -print0 | xargs -0 myCommand
Flimm
Industrial

6 Answers

103

Use find as usual and delimit your lines with NUL. GNU sort can handle these with the -z switch:

find . -print0 | sort -z | xargs -r0 yourcommand
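
As a rough illustration of the pipeline above, the sketch below builds a throwaway directory (the file names are made up) and prints each sorted name in brackets, so a name containing a space visibly stays a single argument. It assumes a sort with -z and an xargs with -r and -0, as in GNU coreutils and findutils; LC_ALL=C is an addition here to make the ordering predictable across locales.

```shell
# Hypothetical scratch directory; the file names exist only for this demo.
demo=$(mktemp -d)
touch "$demo/b file.txt" "$demo/a.txt" "$demo/c.txt"

# NUL-delimited records survive spaces (and newlines) in file names.
# LC_ALL=C selects plain byte-wise ordering.
sorted=$(cd "$demo" && find . -name '*.txt' -print0 |
  LC_ALL=C sort -z |
  xargs -r0 printf '[%s]\n')

printf '%s\n' "$sorted"
rm -rf "$demo"
```

Each bracketed line corresponds to exactly one argument handed to the command.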
Stéphane Chazelas
Oli
  • It does not seem to work with `find . -name '*.dat' -type f -printf '%f\n' | sort -z | xargs -r0 > output.txt`. Is my line wrong due to the printf? – bomben Nov 24 '20 at 17:47
  • @Ben you're not using -print0 and are introducing newlines instead of NULLs. – ychaouche May 24 '21 at 13:29
  • or together with formatting the output `find . -printf "%y %p \n\0" | sort -z` – BMWW Jul 09 '21 at 15:38
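
Following up on the comment exchange above: one way to keep a custom -printf format and still feed sort -z is to end each record with \0 instead of \n. This is only a sketch, assuming GNU find (for -printf) and a scratch directory with made-up file names.

```shell
# Hypothetical scratch tree for the demo.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/sub/x.dat" "$demo/a.dat"

# '%f\0' prints the base name followed by NUL, which is what sort -z
# and xargs -0 expect as the record terminator.
names=$(cd "$demo" && find . -name '*.dat' -type f -printf '%f\0' |
  LC_ALL=C sort -z |
  xargs -r0 printf '%s\n')

printf '%s\n' "$names"
rm -rf "$demo"
```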
67

Some versions of sort have a -z option, which allows for null-terminated records.

find folder1 folder2 -name "*.txt" -print0 | sort -z | xargs -r0 myCommand

Alternatively, you could do the sorting in a higher-level scripting language:

find folder1 folder2 -name "*.txt" -print0 | python3 -c 'import sys; names = sys.stdin.buffer.read().split(b"\0"); sys.stdout.buffer.write(b"".join(n + b"\0" for n in sorted(filter(None, names))))' | xargs -r0 myCommand

Add the -r option to xargs so that myCommand is not run at all when find produces no matches; without it, GNU xargs runs the command once with no arguments.
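
A quick way to see what -r changes is to feed xargs empty input. This is a small sketch; the behavior without -r described in the comments is that of GNU xargs, and other implementations may differ.

```shell
# Empty input, no -r: GNU xargs still runs the command once, with no arguments.
without_r=$(printf '' | xargs -0 echo called)

# Empty input, with -r: the command is skipped entirely.
with_r=$(printf '' | xargs -r0 echo called)

printf 'without -r: "%s"\n' "$without_r"
printf 'with -r:    "%s"\n' "$with_r"
```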

Arcege
  • Good one (two?)... Interestingly, though, the two methods handle `.` differently... With `sort` it winds up at the end of the list... with `python` it sorts to the top. (maybe python sorts with `LC_COLLATE=C`) – Peter.O Mar 16 '12 at 14:45
  • There is also the `-t \0` option for sort (which is a `-z` synonym) – Javier Aug 10 '15 at 18:44
  • The problem with all these `|sort` solutions is that you cannot use `-exec` any longer. OK, although it is possible to rewrite your statement given to `-exec` so that it works with `xargs`, the question is, __what about "mini-scripts"__? (`sh -c ...`) I wouldn't call that trivial to transform a 'sh -c' mini-script with *multiple* commands so that it can work with `xargs` (if possible at all, that is) – syntaxerror Nov 20 '15 at 19:57
  • @syntaxerror: What problem do you have using sh -c with xargs? `printf %s\\n a b c d e | xargs -n3 sh -c 'printf %s, "$@"; printf \\n' x` – Roger Pate Aug 24 '16 at 18:11
  • `-t \0` is not the same as `-z`. `-t` is for field separator, not for line delimiter. – graywolf Jan 22 '20 at 15:05
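
To make the `sh -c` point from the comments concrete, here is a hedged sketch of a multi-command mini-script run over sorted, NUL-delimited names. The directory and file names are invented for the demo, and the trailing `sh` only fills `$0` so that the file names arrive as `"$@"`.

```shell
# Hypothetical scratch directory.
demo=$(mktemp -d)
touch "$demo/b.txt" "$demo/a.txt"

# The mini-script loops over its arguments and runs two commands per file.
out=$(cd "$demo" && find . -name '*.txt' -print0 |
  LC_ALL=C sort -z |
  xargs -r0 sh -c 'for f do printf "%s -> " "$f"; basename "$f"; done' sh)

printf '%s\n' "$out"
rm -rf "$demo"
```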
10

I think you need the -n flag for sort.

According to man sort:

-n, --numeric-sort
    compare according to string numerical value

edit

The -print0 may have something to do with this; I just tested it. Take the -print0 out; you can null-terminate the strings in sort using the -z flag.

whoami
  • Well, that `print0` appears to be space-separating the filenames which is what I need to pass to my command, unfortunately – Industrial Mar 16 '12 at 10:46
5

If you have GNU Parallel http://www.gnu.org/software/parallel/ installed you can do this:

find folder1 folder2 -name "*.txt" -print | 
  sort |
  parallel myCommand

You can install GNU Parallel simply by:

wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem

Watch the intro videos for GNU Parallel to learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Ole Tange
  • What is the justification for using GNU Parallel? To speed it up? – Peter Mortensen Sep 28 '14 at 00:18
  • That and you do not need to mess with \0 separated records. – Ole Tange Sep 28 '14 at 16:46
  • I don't understand that last statement. I create a file with a line break in the file name and execute your command: `cd /tmp && touch $'a\nz' && ls && find -maxdepth 1 -print | sort | parallel echo`. Total false output. I know GNU Parallel now, but that answer misses the original question, doesn't it? – uav Jun 11 '20 at 14:49
  • I know that it is bad practice to use crazy characters in file names - I am already including the blank space. I just see that parallel has a -0 parameter. Nice. No downvote. `find -maxdepth 1 -print0 | sort -z | parallel -0 echo`. – uav Jun 11 '20 at 15:01
  • @uav In my 25 years of sysadmin I have never seen a user making a file with \n. I *have* seen plenty of files with ' space and ". So unless you have evil users or a filesystem with error, I will reckon you will not meet a file with \n that was not made by a fellow sysadm. – Ole Tange Jun 11 '20 at 20:46
  • The original question was about print0. print0 used as separator \0 instead of line breaks. Why does print0 exist? I think in order to have a safe separator and thus be able to handle all the crazy characters. I know you know that. `\n` was just an example. You answer with print. Kinda missed the point. The main thing is to advertise. By the way: `echo 'will cite' | parallel --citation 1>/dev/null 2>/dev/null`. To get rid of that annoying citation message. – uav Jun 12 '20 at 09:38
4

Some implementations of find support ordered traversal directly via the -s option:

$ find -s . -name '*.json'

From the FreeBSD find man page:

-s       Cause find to traverse the file hierarchies in lexicographical
         order, i.e., alphabetical order within each directory.  Note:
         `find -s' and `find | sort' may give different results.
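
The note about differing results can be illustrated without a find that supports -s. The sketch below only runs the sort side, on made-up paths; the description of the traversal order in the comments is my reading of the man page excerpt above, not something the snippet verifies.

```shell
# Paths as find might print them for a directory "a" containing "b",
# next to a sibling file named "a-b" (hypothetical names).
sorted=$(printf '%s\n' a a/b a-b | LC_ALL=C sort)
printf '%s\n' "$sorted"

# sort compares whole path strings, and "-" sorts before "/" in the C locale,
# so "a-b" lands between "a" and "a/b". A per-directory lexicographic
# traversal would instead finish everything under "a" before reaching "a-b".
```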
raychi
2

Some solutions here don't work correctly because the sort command sorts on the full path string instead of just the filename.

This is a fairly complicated but working example that sorts the results of find naturally by filename:

find every_minute -type f -name "*.sh" -printf '%f\t%p\n' | sort -V -k1 | cut -d$'\t' -f2 | tr '\n' '\0' | xargs -r0 -I {} echo 'Found: "{}"'

Result:

Found: "every_minute/api/1_build_synonyms.sh"
Found: "every_minute/search_module/2_rotate_index.sh"
Found: "every_minute/api/3_check_synonyms.sh"
Found: "every_minute/api/4_run_schedule.sh"
Found: "every_minute/search_module/10_test.sh"

Example of the incorrect result produced by find every_minute -type f -name "*.sh" | sort -z | xargs -r0 echo:

every_minute/api/1_build_synonyms.sh
every_minute/api/3_check_synonyms.sh
every_minute/api/4_run_schedule.sh
every_minute/search_module/10_test.sh
every_minute/search_module/2_rotate_index.sh

Based on this answer.
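
A possible NUL-delimited variant of the same idea, offered as a sketch rather than a drop-in replacement: it relies on GNU extensions (find -printf, sort -z with a per-key V flag, sed -z) and uses "/" as the field separator, since a base name can never contain one. The jobs directory and script names are invented for the demo.

```shell
# Hypothetical scratch tree.
demo=$(mktemp -d)
mkdir -p "$demo/jobs/api" "$demo/jobs/mod"
touch "$demo/jobs/api/1_a.sh" "$demo/jobs/mod/2_b.sh" "$demo/jobs/mod/10_c.sh"

# Emit "basename/fullpath" records, version-sort on the first "/"-separated
# field, then strip that prefix again so only the full path remains.
result=$(cd "$demo" && find jobs -type f -name '*.sh' -printf '%f/%p\0' |
  sort -z -t/ -k1,1V |
  sed -z 's|^[^/]*/||' |
  xargs -r0 printf '%s\n')

printf '%s\n' "$result"
rm -rf "$demo"
```

Because every stage stays NUL-delimited, file names containing tabs or newlines should not break the records.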

James Bond