24

I have a bunch of .text files, most of which end with the standard nl.

A couple don't have any terminator at end. The last physical byte is (generally) an alphameric character.

I was using cat *.text >| /tmp/joined.text, but then noticed a couple of places in joined.text where the first line of a file appeared at the end of the last line of a previous file. Inspecting the previous file, I saw there wasn't a line terminator -- concatenation explained.

That raised the question, what's the easiest way to concatenate, sticking in the missing newline? What about these options?

  1. A solution that might effectively add a blank line to some input files. For me, that's not a problem as the processing of joined.text can handle it.
  2. A solution that adds the cr/lf only to files that do not already end that way.
Giacomo1968
HiTechHiTouch
  • 1
    Safest is to add the missing newline e.g. http://unix.stackexchange.com/questions/31947/how-to-add-a-newline-to-the-end-of-a-file?rq=1 Totally unsafe is leaving those broken files around then wondering why a shell `while` is skipping those broken last lines. – thrig Feb 16 '17 at 19:25
  • Do you really want a cr/lf or do you want the normal, standard `\n`? On *nix systems, lines end with a single `\n`. The `\r\n` is a Windows thing. And where do you want this? At the end of each line? The end of the file? – terdon Feb 16 '17 at 19:45
  • @thrig But which specific files? In other words, what's a good way to automatically identify them (instead of opening each and every candidate)? And if another one gets accidentally generated, then an automated method would be extra nice! – HiTechHiTouch Feb 16 '17 at 19:46
  • @terdon Thanks for the catch. My Windows heritage shows... The nl goes only at the end of a file that doesn't have one. Each line in a multi-line file ends with nl, except for the last, probably because some editor dropped it. – HiTechHiTouch Feb 16 '17 at 19:48
  • @terdon that idea would work for Option 1, however the way I read the find man, '%s\n' would append the size of the file. Probably want just '\n'? – HiTechHiTouch Feb 16 '17 at 19:55
  • This has nothing to do with `find`; yes, in `find`, the `%s` of `printf` is the size of the file. But that's a peculiarity of `find`. The `printf` utility is very standard and exists (behaving more or less the same way) in shells and most programming languages. There, `printf '%s' foo` will just replace the `%s` with `foo` and print it. – terdon Feb 16 '17 at 21:06

5 Answers

36

Another command that can add newlines if needed is awk, so:

awk 1 ./*.txt

The 1 here is the simplest way to get a true condition in awk, which works for this purpose since awk's default action on a true condition is to print the input line.
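As a rough illustration of the difference, here is a small sketch that builds two throwaway files (the names and contents are made up for the demo) and joins them once with cat and once with awk 1. It assumes a POSIX shell with mktemp available.

```shell
# Throwaway directory with two hypothetical input files.
dir=$(mktemp -d)
printf 'alpha\nbeta' > "$dir/a.txt"   # last line has no newline
printf 'gamma\n'     > "$dir/b.txt"   # properly terminated

# cat copies bytes verbatim, so "beta" and "gamma" end up on one line.
joined_cat=$(cat "$dir"/*.txt)

# awk 1 prints each record followed by a newline, so the lines stay apart.
joined_awk=$(awk 1 "$dir"/*.txt)

printf 'cat:\n%s\n\nawk 1:\n%s\n' "$joined_cat" "$joined_awk"
rm -rf "$dir"
```

With cat the second line of the output reads betagamma; with awk 1 the two stay on separate lines.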

Braiam
muru
  • 1
    Hi @muru, can you explain what "awk 1" means? – Jon Jul 03 '18 at 01:46
  • 6
    @Jon awk's default action on true conditions is to print the input lines, and `1` is the simplest true condition. It's shorthand for `awk '{print}'` – muru Jul 03 '18 at 01:52
5

With some cut implementations like GNU cut, you can do:

cut -b 1- ./*.text > output

as it will add the trailing newline if it is missing.

Stéphane Chazelas
4

This handy Perl one-liner can do the job of adding the missing newline only if not already there:

perl -lpe '' ./*.text > output
Stéphane Chazelas
Rakesh Sharma
3

You could use this:

grep -h "" ./*.txt

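-h suppresses the file name prefix that grep would otherwise print on each line when given several files. The empty pattern "" matches every line, and grep ends each line it prints with a newline.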
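To see what -h changes, the following sketch runs the same command with and without it on two throwaway files (hypothetical names and contents). It assumes a grep that supports -h, as GNU and BSD grep do.

```shell
# Throwaway directory with two hypothetical input files.
dir=$(mktemp -d)
printf 'one\ntwo' > "$dir/a.txt"   # last line has no newline
printf 'three\n'  > "$dir/b.txt"

# With -h: only the matched lines, one per output line.
with_h=$(cd "$dir" && grep -h "" ./*.txt)

# Without -h: each line is prefixed with the name of the file it came from.
without_h=$(cd "$dir" && grep "" ./*.txt)

printf 'with -h:\n%s\n\nwithout -h:\n%s\n' "$with_h" "$without_h"
rm -rf "$dir"
```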

Wadih M.
1

The first approach that comes to mind is to loop over the files and just print their contents with an appended newline:

for f in *text; do
    printf '%s\n' "$(cat < "$f")"
done > /tmp/joined.text

The $() will strip any trailing newline characters, so this will result in just one \n at the end of each file.
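A variation on the same loop, offered only as a sketch: instead of reading each file into a variable, it streams the file with cat and then checks the last byte with tail -c 1, adding a newline only when one is missing (option 2 in the question). The function name join_text and the demo file names are hypothetical, and it assumes the inputs are ordinary text files.

```shell
# join_text FILE...: concatenate files to stdout, adding a newline only
# after files whose last byte is not already a newline.
# (join_text is a made-up helper name for this sketch.)
join_text() {
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip an unmatched glob or a directory
        cat -- "$f"
        # $(...) strips a trailing newline, so the result is non-empty only
        # when the file is non-empty and ends in some other character.
        if [ -n "$(tail -c 1 -- "$f")" ]; then
            echo
        fi
    done
}

# Demo with throwaway files.
dir=$(mktemp -d)
printf 'first\nsecond' > "$dir/a.text"   # missing final newline
printf 'third\n'       > "$dir/b.text"   # already terminated
: > "$dir/c.text"                        # empty file: contributes nothing

join_text "$dir"/*.text > "$dir/joined.out"
joined=$(cat "$dir/joined.out")
bytes=$(($(wc -c < "$dir/joined.out")))  # arithmetic drops any padding from wc
printf '%s\n(%s bytes)\n' "$joined" "$bytes"
rm -rf "$dir"
```

Because the file is never held in a shell variable, this avoids loading whole files into memory, and an empty file adds no blank line to the output.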

Stéphane Chazelas
terdon
  • Don't want to strip existing NLs -- that would just run all the lines together, compounding my problem. What I hear you telling me is for Option 1, just loop through all the files, printing each one then a NL. As a newbie, I'm surprised there isn't an existing utility that forces a newline when necessary so lines don't run together. – HiTechHiTouch Feb 16 '17 at 20:03
  • @HiTechHiTouch this will both remove any existing `\n` *and* add one. The result will always be one (and only one) `\n` at the end of each file. The `%s` is a `printf` thing, it just means "string". See [here](https://en.wikipedia.org/wiki/Printf_format_string). You are confusing it with the `[ -s file ]` which is the size of the file. This does both option 1 and option 2. As for a utility, no there isn't because any program that writes to a file always adds a newline. If there isn't one, that is almost always because something broke and the file is corrupted. – terdon Feb 16 '17 at 21:04
  • 1
    Note that it adds an empty line for empty files (or files that can't be opened for reading). In shells other than `zsh`, it will choke on NUL characters. It should probably also be noted that it loads the whole files in memory. – Stéphane Chazelas Jan 30 '18 at 09:49