Given a 425 MB text file with the following content:
--START--
Data=asdfasdf
Device=B
Lorem=Ipsum
--END--
--START--
Data=asdfasdf
Lorem=Ipsum
Device=A
--END--
--START--
Device=B
Data=asdfasdf
--END--
...
The task is to use sed to print every block between --START-- and --END-- that contains Device=A. Two solutions were provided, here and here. There is a huge difference in execution time between the two commands. The second command is much faster, but I need a more detailed explanation of how it works.
$ sed -n '/--START--/{:a;N;/--END--/!ba; /Device=A/p}' file
$ sed 'H;/--START--/h;/--END--/!d;x;/Device=A/!d' file
The description of the first command:
How it works:
/--START--/{...}
    Every time we reach a line that contains --START--, run the commands inside the braces {...}.
:a
    Define a label "a".
N
    Read the next line and append it to the pattern space.
/--END--/!ba
    Unless the pattern space now contains --END--, jump back to label a.
/Device=A/p
    If we get here, the pattern space starts with --START-- and ends with --END--. If, in addition, the pattern space contains Device=A, then print (p) it.
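To see the steps above in action, here is a minimal sketch that runs the same commands, written one per line, against the three sample blocks from the question. The temporary file and the variable names (sample, first_result) are invented for this illustration and are not part of the original answer.

```shell
#!/bin/sh
# Hypothetical sample file holding the three blocks from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
--START--
Data=asdfasdf
Device=B
Lorem=Ipsum
--END--
--START--
Data=asdfasdf
Lorem=Ipsum
Device=A
--END--
--START--
Device=B
Data=asdfasdf
--END--
EOF

# Same commands as the one-liner, one per line, with comments on their own lines.
first_result=$(sed -n '
/--START--/ {
    # label "a": top of the collecting loop
    :a
    # append the next input line to the pattern space
    N
    # no --END-- seen yet: jump back to "a" and keep collecting
    /--END--/!ba
    # the whole block is now in the pattern space; print it if it mentions Device=A
    /Device=A/p
}
' "$sample")

printf '%s\n' "$first_result"
rm -f "$sample"
```

Run against this sample, only the middle block (the one containing Device=A) is printed.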
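Description of the second command:
sed '
H                # append the current line to the hold space
/--START--/h     # on a START line, overwrite the hold space with that line (whatever was held before is discarded)
/--END--/!d      # if this is not an END line, delete the pattern space and start the next cycle
x                # on an END line, swap the hold space and the pattern space, so the collected block is now in the pattern space
/Device=A/!d     # delete the block if it does not contain "Device=A"; otherwise it is printed at the end of the cycle
' file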
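As a sanity check on this description, the following sketch runs both commands from the question on the same three sample blocks and compares their output. The file and variable names (sample, out_loop, out_hold) are invented for the illustration, and GNU sed is assumed, since the first one-liner puts a label and further commands on one line separated by semicolons.

```shell
#!/bin/sh
# Hypothetical sample file holding the three blocks from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
--START--
Data=asdfasdf
Device=B
Lorem=Ipsum
--END--
--START--
Data=asdfasdf
Lorem=Ipsum
Device=A
--END--
--START--
Device=B
Data=asdfasdf
--END--
EOF

# First command: collect each block in the pattern space with an N loop.
out_loop=$(sed -n '/--START--/{:a;N;/--END--/!ba; /Device=A/p}' "$sample")

# Second command: collect each block in the hold space, swap it in at END.
out_hold=$(sed 'H;/--START--/h;/--END--/!d;x;/Device=A/!d' "$sample")

if [ "$out_loop" = "$out_hold" ]; then
    echo "both commands select the same block(s):"
    printf '%s\n' "$out_hold"
else
    echo "outputs differ" >&2
fi
rm -f "$sample"
```

To compare speed on the real file, each command can be prefixed with the shell's time. One plausible reason for the gap, not measured here: in the first command /--END--/ is matched again against the whole accumulated pattern space after every N, whereas the second command matches /--START--/ and /--END--/ against the current line only and looks at the full block once, after x.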