
Is it possible to remove the delimiter with csplit? Example:

$ cat in
abc
---
def
---
ghi
$ csplit -q in /-/ '{*}'
$ ls x*
xx00  xx01  xx02
$ head xx*
==> xx00 <==
abc

==> xx01 <==
---
def

==> xx02 <==
---
ghi

Instead of what it did, i.e. split and keep the delimiter, can it be asked to split and remove the delimiter?

That is, the desired output would be this:

$ sed -i '/-/d' xx*
$ head xx*
==> xx00 <==
abc

==> xx01 <==
def

==> xx02 <==
ghi

While it can be done in two steps as above, can it be done in one step?

If it cannot be done with csplit, is there a one-step way that is shorter than the two invocations (csplit + sed) above? I have no preference as to the tool used, as long as the result is reasonably readable.

levant pied

3 Answers


Since you seem to be using GNU csplit, it's quite simple:

csplit --suppress-matched infile /PATTERN/ '{*}'

i.e. use --suppress-matched to suppress the lines matching PATTERN.


Per your note, this option is available only in more recent versions of csplit (coreutils ≥ 8.22).
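
Applied to the sample file from the question, this might look like the sketch below. It assumes GNU csplit from coreutils 8.22 or later; the scratch directory and the `workdir` variable are only there to keep the example self-contained, and -q merely silences the byte counts, as in the question.

```shell
# Assumption: GNU csplit from coreutils >= 8.22 (older versions lack --suppress-matched).
# Work in a scratch directory so no existing xx* files get overwritten.
workdir=$(mktemp -d) && cd "$workdir"

# Recreate the sample input from the question.
printf 'abc\n---\ndef\n---\nghi\n' > in

# Split on every line matching /-/ and drop the matching lines themselves.
csplit -q --suppress-matched in /-/ '{*}'

# Show the resulting pieces.
head xx*
```

With the sample input, the three pieces should contain abc, def and ghi, without the --- lines.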

don_crissti
    Ha! The distribution that I'm on uses coreutils 8.4 (yes, ancient), which looks like it doesn't support that option... It was added in 2013: http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=4114c93af398d7aecb5eb253f90d9b4cc0785643, so that was included in 8.22. Thanks for the hint though, will become useful with the upgrade I guess! – levant pied May 10 '16 at 18:31
perl -ne 'BEGIN { $fnum=0; open $fh, ">", sprintf "xx%02d", $fnum++ } if (m/-/) { open $fh, ">", sprintf "xx%02d", $fnum++ } else { print $fh $_ }' inputfileorfileshere

Or use the same approach (reopen a new output file whenever a line matches the delimiter) in awk or another tool.
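
A minimal awk sketch of that reopen-on-match idea, using the sample input from the question. The variable names `out` and `n` and the scratch directory are illustrative choices, not part of the original answer.

```shell
# Work in a scratch directory so no existing xx* files get overwritten.
workdir=$(mktemp -d) && cd "$workdir"

# Recreate the sample input from the question.
printf 'abc\n---\ndef\n---\nghi\n' > in

awk '
  # Start with the first output file, xx00 (n is 0 when first used).
  BEGIN { out = sprintf("xx%02d", n++) }
  # On a delimiter line: close the current file, switch to the next one,
  # and skip the delimiter so it is not written anywhere.
  /-/   { close(out); out = sprintf("xx%02d", n++); next }
  # Every other line goes to the current output file.
        { print > out }
' in

# Show the resulting pieces.
head xx*
```

Unlike the perl version, this creates an output file only once a non-delimiter line is written to it, so an input that begins with a delimiter would not produce an empty xx00.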

thrig

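If you can make do with a string match rather than a regex match (note that a multi-character RS is not specified by POSIX, so this relies on an awk that supports it, such as gawk or mawk):

awk 'BEGIN {RS="---\n"; ORS=""} {print > (sprintf("xx%02d", NR-1))}' in

Using NR-1 makes the numbering start at xx00, matching csplit's file names.

With GNU awk (at least in v4.0.1), it is possible to use a regex for RS, e.g.:

gawk 'BEGIN {RS="-+\n"; ORS=""} {print > (sprintf("xx%02d", NR-1))}' in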
steeldriver