
We have a file on Linux that contains one record per line, but a problem arises when a record itself contains newline characters. In that case, a backslash is appended at the end of the line and the record is split across multiple lines. So this:

"abc def \
xyz pqr"

should be:

"abc def xyz pqr"

I tried sed -I 's/\\\n/ /g' <file_name>, which does not work. I also tried the tr command, but it replaces only single characters, not a string. Can you please suggest a command to handle this?

HalosGhost
Deepak

5 Answers


You should be able to use

sed -e :a -e '/\\$/N; s/\\\n//; ta'

See Peter Krumins' Famous Sed One-Liners Explained, Part I, 39. Append a line to the next if it ends with a backslash "\".
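To see why this works where a lone `s/\\\n/ /g` does not: sed strips the trailing newline before it places a line in the pattern space, so `\n` cannot match until `N` has appended a following line. The sketch below wraps the same command in a shell function and runs it on a record split over three lines. The function name `join_continued` and the sample data are made up for illustration.

```shell
# Hypothetical wrapper around the one-liner above; reads files given as
# arguments, or standard input when there are none.
join_continued() {
    # :a        label marking the top of the loop
    # /\\$/N    if the pattern space ends in a backslash, append the next line
    # s/\\\n//  delete the backslash together with the embedded newline
    # ta        if that substitution succeeded, jump back to :a and test again
    sed -e :a -e '/\\$/N; s/\\\n//; ta' "$@"
}

# Sample data (an assumption): one record split across three physical lines,
# followed by an ordinary record.
sample=$(printf '%s\n' '"abc def \' 'xyz pqr \' 'end"' 'plain line')

printf '%s\n' "$sample" | join_continued
# prints:
# "abc def xyz pqr end"
# plain line
```

Because the loop repeats for as long as the joined line still ends in a backslash, a record split over any number of lines is reassembled.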

steeldriver

The shortest solution seems to be with perl:

perl -pe 's/\\\n//'
vinc17

You can use awk:

$ awk 'sub(/\\$/,""){printf("%s", $0);next};1' file
"abc def xyz pqr"
cuonglm
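# read without -r treats a trailing backslash as a line continuation;
# IFS= keeps leading and trailing whitespace intact
while IFS= read twolines
do printf '%s\n' "$twolines"
done <file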

...which is what I suspect was the intended destiny for that file anyway. With sed you might do:

sed 'N;s/\([^\\]\)\\\n/\1/;P;D' <file

...which would at least protect backslash quoted quotes, though it misses backslash quoted backslash quoted quotes. Yeah, it's kind of a nightmare regex-wise, but the while read...done thing handles all of those cases properly. Admittedly, though, with some adaptation, steeldriver's solution could reliably handle all cases, too, because the t command can recurse once per successful substitution.

Still, if that's not a problem then:

sed 'N;s/\\\n//;P;D' <file

...does the job.
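A quick way to compare the two approaches is to feed both a record split over more than two lines. The sketch below uses made-up sample data and the illustrative variable names `chain`, `window_out`, and `read_out`. It shows the `N;...;P;D` window joining lines only pairwise, while `read` without `-r` follows the whole chain.

```shell
# Sample data (an assumption): one record split across four physical lines.
chain=$(printf '%s\n' 'foo \' 'bar \' 'baz \' 'qux')

# Sliding window: after the first pair is joined there is no newline left in
# the pattern space, so D discards it and the next cycle starts with "baz \".
window_out=$(printf '%s\n' "$chain" | sed 'N;s/\\\n//;P;D')
printf '%s\n' "$window_out"
# prints:
# foo bar \
# baz qux

# read without -r keeps consuming lines while they end in a backslash.
read_out=$(printf '%s\n' "$chain" | while IFS= read line
do printf '%s\n' "$line"
done)
printf '%s\n' "$read_out"
# prints:
# foo bar baz qux
```

One caveat with the `read` loop: without `-r`, a backslash in front of any other character is removed as well, so it only suits data where backslashes are meant as shell-style escapes.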

mikeserv
  • This only processes two consecutive lines. Try `str=$'foo bar \\\nbash 1 \\\nbash 2 \\\nbash 3 \\\nbash 4 \\\nbaz\ndude \\\nhappy\nxxx\nvvv 1 \\\nvvv 2 \\\nCCC'; echo "$str" | sed 'N;s/\([^\\]\)\\\n/\1/;P;D'`. –  Aug 02 '18 at 13:32
  • @Isaac - yeah. to do more than that the `t` command is referred for pattern-space recursion. the `N;...P;D` sliding window solutions are *usually* tailored for outside max handles - they're often a good means of optimizing a test loop... though a combination of the two techniques with the `t`est laid out *ahead* of either or both of the `s`ubstitution and `N`/`n`ext line grab as applied after a `D`elimited deletion can arbitrarily recurse nested/differed quote contexts at minimal retainer for streamed ops. essentially a bookmarked mem jump... anyway, your `$str` doesn't have any double-quotes. – mikeserv Aug 04 '18 at 22:34

Another awk variation

awk '{ORS = /\\$/ ? "" : RS; sub(/\\$/, ""); print}' file
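One thing to watch with the ORS approach is that the test choosing ORS should be anchored to the end of the line. Otherwise a record that merely contains a backslash somewhere loses its newline too. The sketch below uses made-up sample data and the illustrative variable name `awk_out`, with a first line that has a backslash in the middle.

```shell
# Sample data (an assumption): a line with an interior backslash, then a
# record split across two physical lines.
sample=$(printf '%s\n' 'path C:\tmp kept' 'abc def \' 'xyz pqr')

# ORS becomes empty only when the line ENDS in a backslash; otherwise it is
# reset to RS (a newline by default). sub() then strips the trailing backslash.
awk_out=$(printf '%s\n' "$sample" |
    awk '{ORS = /\\$/ ? "" : RS; sub(/\\$/, ""); print}')
printf '%s\n' "$awk_out"
# prints:
# path C:\tmp kept
# abc def xyz pqr
```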
iruvar