How to print lines between the same pattern with multiple occurrences?

Question

I need to extract the lines between specified occurrences of the same search pattern.

For example, I want to get the lines between the 1st and 2nd occurrence, or the lines between the 3rd and 4th occurrence, of the search pattern. The number of lines between occurrences may vary; if there are no lines between two occurrences, the output should be blank.

Example:

Line 1
Line 2
Line 3
Pattern
Line 5
Line 6
Line 7
Pattern
Line 8
Line 9
Pattern
Line 11
Line 12
Pattern
Line 13
Pattern
Pattern

Expected output: lines between the 1st and 2nd occurrence

Line 5
Line 6
Line 7

Expected output: lines between the 3rd and 4th occurrence

Line 11
Line 12

https://unix.stackexchange.com/search?q=awk+lines+between+pattern — Sparhawk, May 18 '17 at 06:52
[Third hit](https://unix.stackexchange.com/questions/346203/grep-the-lines-between-the-occurrence-of-the-same-pattern). Although I'm not sure what your expected output is exactly. You have two code blocks as output. Do they represent separate files? — Sparhawk, May 18 '17 at 07:02
Yes Output would be in different files and the input would be a single file. — Vasanta Koli, May 18 '17 at 07:03
I just realised it's slightly different to the answer I linked. I've posted some modified code. — Sparhawk, May 18 '17 at 07:11
Hi, I saw the post, but the challenge is that it creates multiple files, whereas I need the specific lines as output so that I can do further text processing on them before printing to any file — Vasanta Koli, May 18 '17 at 07:15
Well, you'll have to be more specific about what "further" text processing is. How would you do that precisely? Can you put that into the awk code? Can you write to temporary files, then iterate across those? — Sparhawk, May 18 '17 at 07:16

score 4 · Accepted Answer · answered May 18 '17 at 07:10

Based on this answer,

awk '/Pattern/{n+=1}; n % 2 == 1 && ! /Pattern/ {print > ("output" ((n-1)/2))}' input_file

Explanation

/Pattern/{n+=1}: when a line matches Pattern, increment n by 1.
n % 2 == 1 && ! /Pattern/: only do the following when n is odd, i.e. after the 1st, 3rd, 5th… occurrence of the pattern. Also, ignore the lines that contain Pattern themselves.
{print > ("output" ((n-1)/2))}: if the above holds, print that line into a file named outputx, where x is (n-1)/2, i.e. output0, output1, output2… The parentheses around the file name keep the redirection target unambiguous across awk implementations.
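As a rough illustration of the command above, the following sketch runs it on the sample input from the question inside a scratch directory. The scratch directory and the parenthesised file-name expression are assumptions added here for safety and portability; they are not taken from the answer.

```shell
#!/bin/sh
# Sketch: split the question's sample input into output0, output1, ...
# Assumption: a throwaway directory from mktemp, so no real files are overwritten.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Sample input from the question, one item per line.
printf '%s\n' 'Line 1' 'Line 2' 'Line 3' Pattern 'Line 5' 'Line 6' 'Line 7' \
    Pattern 'Line 8' 'Line 9' Pattern 'Line 11' 'Line 12' Pattern 'Line 13' \
    Pattern Pattern > input_file

# n counts Pattern lines; lines seen while n is odd go to output<(n-1)/2>.
awk '/Pattern/ { n += 1 }
     n % 2 == 1 && !/Pattern/ { print > ("output" ((n - 1) / 2)) }' input_file

ls output*      # lists the files that were created
cat output1     # lines between the 3rd and 4th occurrence
```

A gap with no lines in it produces no file at all, so the number in each file name stays tied to the pair of occurrences it came from.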

Sparhawk

Sure let me try this one and update you on the same – Vasanta Koli May 18 '17 at 07:17
This is working best for me Thanks for your help, I'll need to just pickup the files with the numbers, Above creates files with output0, output1 etc. in case the contents between pattern has no line, no output file is skipped where I can get numbers perfectly to work on, by taking exact number of output file – Vasanta Koli May 18 '17 at 07:23
No worries. If this answers your question (I'm not 100% sure if it does), please click the tick mark on the left. – Sparhawk May 18 '17 at 07:26
Already Done :P – Vasanta Koli May 18 '17 at 07:27
I think that was the upvote arrow, not the tick? :p – Sparhawk May 18 '17 at 07:27

Sergiy Kolodyazhnyy · Answer 2 · 2017-05-18T07:51:27.327

Alternative AWK approach

$ awk -v start=3 '/Pattern/{n++;next};n==start;n==start+1{exit}' input.txt
Line 11
Line 12

$ awk -v start=2 '/Pattern/{n++;next};n==start;n==start+1{exit}' input.txt
Line 8
Line 9

Explanation

The way this works is fairly straightforward:

With the -v flag we define the variable start. A separate counter, n, is incremented every time a line matches the pattern, and next then skips straight to the following line (that's the /Pattern/{n++;next} part of the code).
In awk, a condition that is true and has no action attached automatically prints the line, hence n==start can be viewed the same way as n==start{print}.
The final code block, n==start+1{exit}, checks whether we have got past the next pattern. Say we wanted to print the lines between the 4th and 5th pattern occurrence: when n==4+1, the code exits.
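The three rules described above can be laid out one per line, as in the sketch below. The function name lines_between and the temporary sample file are hypothetical, introduced here only for illustration.

```shell
#!/bin/sh
# lines_between N FILE (hypothetical helper): print the lines between
# the N-th and (N+1)-th line matching Pattern.
lines_between() {
    awk -v start="$1" '
        /Pattern/      { n++; next }  # count the marker, then skip it
        n == start                    # no action given, so awk prints the line
        n == start + 1 { exit }       # first line past the closing marker: stop
    ' "$2"
}

# Sample input from the question, written to a temporary file.
sample=$(mktemp) || exit 1
printf '%s\n' 'Line 1' 'Line 2' 'Line 3' Pattern 'Line 5' 'Line 6' 'Line 7' \
    Pattern 'Line 8' 'Line 9' Pattern 'Line 11' 'Line 12' Pattern 'Line 13' \
    Pattern Pattern > "$sample"

lines_between 1 "$sample"   # prints Line 5, Line 6 and Line 7
```

Asking for a gap that contains no lines, such as `lines_between 5 "$sample"` on this input, prints nothing, which matches the blank output the question asks for.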

If we were doing code-golf, we could make this even shorter by renaming the start variable to something like s (e.g. -v s=3), which shortens the code like so:

awk -v s=3  '/Pattern/{n++;next};n==s;n==s+1{exit}'

Assumptions:

GNU awk
we're reading between consecutive patterns, i.e. between match n and n+1

Generalizing the approach

What if we wanted to print the lines from pattern 2 to pattern 4? Using a few of the tricks from the previous example, we can do that as well, like so:

$ awk -v start=2 -v finish=4 '/Pattern/{n++;next};n==finish{exit};n>=start' input.txt
Line 8
Line 9
Line 11
Line 12

Notice that here we define another variable, finish, to know where to stop: as soon as n==finish, awk exits and printing stops. Notice also that n==finish{exit} comes before n>=start, so the line on which we are supposed to exit is not printed first.
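Since the asker wanted to process the extracted lines further before writing them anywhere, one possible way to use the generalized command is to capture its output in a shell variable. In this sketch the pattern is also passed in with -v; the function name lines_in_range and the awk variable pat are assumptions, not part of the answer.

```shell
#!/bin/sh
# lines_in_range START FINISH REGEX FILE (hypothetical helper).
lines_in_range() {
    awk -v start="$1" -v finish="$2" -v pat="$3" '
        $0 ~ pat    { n++; next }   # count marker lines and skip them
        n == finish { exit }        # stop before printing anything further
        n >= start                  # between START and FINISH: print
    ' "$4"
}

# Sample input from the question, written to a temporary file.
sample=$(mktemp) || exit 1
printf '%s\n' 'Line 1' 'Line 2' 'Line 3' Pattern 'Line 5' 'Line 6' 'Line 7' \
    Pattern 'Line 8' 'Line 9' Pattern 'Line 11' 'Line 12' Pattern 'Line 13' \
    Pattern Pattern > "$sample"

# Capture the block, then post-process it, here by upper-casing each line.
block=$(lines_in_range 2 4 'Pattern' "$sample")
printf '%s\n' "$block" | tr 'a-z' 'A-Z'
```

Keeping the extraction in a function means the later processing step can be swapped out without touching the awk program.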

score 0 · Answer 3 · answered May 18 '17 at 07:26

With sed:

sed -n '/Pattern/!d;:a
n;//! {w file1.txt
ba
};:b
n;//! bb
:c
n;//q;w file2.txt
bc
' file

With POSIX sed you need three loops like this, one for each of the two wanted ranges and one for the lines in between them, as you can't generate file names from within the script.

Philippos

score 0 · Answer 4 · answered May 18 '17 at 10:30

start=3; # these can only be positive integers
 stop=4; # stop > start

perl -lne "// or print if /Pattern/ && ++\$a == $start ... // && ++\$a == $stop" data.in

The Perl solution uses the range operator ..., whose two operands act like a flip-flop: as long as the first operand is false, ... returns false. As soon as the first operand becomes true, ... returns true, and it only goes back to false once the second operand becomes true. The subtlety is that operand 1 is not evaluated again once it has become true, and operand 2 is not evaluated while operand 1 is still false.
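For readers less familiar with Perl, the latch behaviour described above can be imitated in awk with an explicit flag. This is a swapped-in illustration rather than the answer's Perl code; the flag name inside and the temporary sample file are assumptions.

```shell
#!/bin/sh
start=3
stop=4

# Sample input from the question, written to a temporary file.
sample=$(mktemp) || exit 1
printf '%s\n' 'Line 1' 'Line 2' 'Line 3' Pattern 'Line 5' 'Line 6' 'Line 7' \
    Pattern 'Line 8' 'Line 9' Pattern 'Line 11' 'Line 12' Pattern 'Line 13' \
    Pattern Pattern > "$sample"

result=$(awk -v start="$start" -v stop="$stop" '
    # Latch is off: only the opening test runs, so n grows once per marker.
    !inside && /Pattern/ && ++n == start { inside = 1; next }
    # Latch is on: only the closing test runs.
    inside && /Pattern/ && ++n == stop   { exit }
    # While the latch is on, print everything except marker lines.
    inside && !/Pattern/
' "$sample")

printf '%s\n' "$result"
```

Because && short-circuits, the counter is incremented by exactly one of the two tests on any marker line, which mirrors the way the flip-flop evaluates only one operand at a time.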

sed -nE "
   /Pattern/!d
   x
      s/\$/./
      /^[.]{$start}\$/!{x;d;}
   x

   n

   :loop
      p;n
      /Pattern/{
         x
            s/\$/./
            /^.{$stop}\$/q
         x
      }
   bloop
" data.in

The sed solution uses the hold space to keep a count of how many times the pattern has been seen. We keep rejecting lines as long as fewer than $start patterns have been seen. As soon as the $start-th pattern arrives, we enter a loop that prints the current line and reads the next one, all the while checking whether the $stop-th pattern has been seen. Once it has, we quit immediately.
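The counting trick described above can be seen in isolation in this small sketch, which only tallies marker lines: one dot is appended to the hold space per match, and the tally is printed at the end of the input. The variable names and the temporary sample file are assumptions for illustration.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file.
sample=$(mktemp) || exit 1
printf '%s\n' 'Line 1' 'Line 2' 'Line 3' Pattern 'Line 5' 'Line 6' 'Line 7' \
    Pattern 'Line 8' 'Line 9' Pattern 'Line 11' 'Line 12' Pattern 'Line 13' \
    Pattern Pattern > "$sample"

# On each Pattern line: swap in the hold space, append one dot, swap back.
# On the last line: swap the tally into the pattern space and print it.
dots=$(sed -n '
/Pattern/{
    x
    s/$/./
    x
}
${
    x
    p
}
' "$sample")

echo "${#dots}"   # length of the tally = number of Pattern lines
```

A run of dots can then be compared against a fixed length with a regular expression, which is how the script above tests for the $start-th and $stop-th occurrence.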
