0

I'm trying to extract lines from a subtitle (.srt) file. When I grep for a specific line number, I get the answer I expect:

% grep -e "^817" ponyo.srt                                             
817
%

but when I try to grep for that line including a carriage return (or a carriage return and EOL), I get a blank line:

% grep -e "^817\r" ponyo.srt 

% grep -e "^817^M$" ponyo.srt

%

Here's the text file using "cat -e" to show hidden characters:

% cat -e ponyo.srt 
1^M$
00:04:38,478 --> 00:04:43,381^M$
The Beginning^M$
^M$
2^M$
00:04:44,751 --> 00:04:51,122^M$
PONYO ON THE CLIFF BY THE SEA^M$
^M$
474^M$
01:00:23,016 --> 01:00:25,041^M$
Stay here with Ponyo.^M$
^M$
475^M$
01:00:25,285 --> 01:00:28,618^M$
I'm going too.^M$
Let's take Ponyo with us.^M$
^M$
817^M$
01:40:08,532 --> 01:40:13,834^M$
<i>Oh he 's my favorite little boy</i>^M$
^M$
823^M$
01:40:32,456 --> 01:40:38,156^M$
Studio Ghibli^M$
^M$
824^M$
01:40:39,530 --> 01:40:42,624^M$
The End^M$
^M$
825^M$
01:40:42,766 --> 01:40:45,792^M$
English translation by^M$
Jim Hubbert and Rieko Izutsu-Vajirasarn^M$
English subtitles by^M$
Aura^M$
^M$
%

How can I grep for the end of lines and get the whole line in the results?

EDIT: To add, just searching for EOL returns nothing, as I would expect:

% grep -e "^817$" ponyo.srt 
%
  • https://duckduckgo.com/?q=grep+regex&ia=web – jsotola Feb 02 '22 at 01:41
  • https://unix.stackexchange.com/questions/124462/detecting-pattern-at-the-end-of-a-line-with-grep#124463 – jsotola Feb 02 '22 at 01:43
  • @jsotola Neither of those links appear to address my concern. Can you elaborate? – Johnny Rollerfeet Feb 02 '22 at 01:48
  • It seems like you probably just want to install dos2unix and run `mac2unix` on the file before attempting to work with it. – jesse_b Feb 02 '22 at 01:49
  • Do you have `grep` aliased to something that includes a `--color=` specification? – steeldriver Feb 02 '22 at 01:52
  • @steeldriver Yes. "alias grep='grep --color=auto'" I'll remove that and try again. – Johnny Rollerfeet Feb 02 '22 at 01:59
  • ... see for example [grep --color=auto breaks when ^M is inside colored match](https://unix.stackexchange.com/questions/350352/grep-color-auto-breaks-when-m-is-inside-colored-match) – steeldriver Feb 02 '22 at 01:59
  • @steeldriver Yep, that was the problem. Do you want to answer and get credit? (Or I can answer myself and hog all the fake internet points for myself.) – Johnny Rollerfeet Feb 02 '22 at 02:01
  • @JohnnyRollerfeet ... first link was to use regex to match a whole line ... second link indicates that `$` matches to end of line, no need for trying to match `\r` or `\n` – jsotola Feb 02 '22 at 02:01
  • @jsotola Using just $ doesn't return a result because there is a character (^M) between the number and the EOL. – Johnny Rollerfeet Feb 02 '22 at 02:03
  • Does this answer your question? [grep --color=auto breaks when ^M is inside colored match](https://unix.stackexchange.com/questions/350352/grep-color-auto-breaks-when-m-is-inside-colored-match) – steeldriver Feb 02 '22 at 12:35
  • The native EOL (end-of-line) character in UNIX is LF `\n`, not CR `\r`, so matching `$` may not be what you expect. – U. Windl Feb 08 '22 at 14:10

2 Answers

3

(Credit to @steeldriver for the answer.)

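As explained in the post linked in the comments (grep --color=auto breaks when ^M is inside colored match), grep --color wraps each match in terminal escape sequences. When the match ends in a carriage return, the cursor moves back to the start of the line and the escape sequence that follows the match clears it, so the output looks blank even though grep found the line. I had alias grep='grep --color=auto' in my bash settings. To bypass the alias, both to troubleshoot this issue and as a permanent solution, I prefix the command with \ so the shell runs grep without the alias.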

% \grep -e "^817^M" ponyo.srt
817
% 
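
For anyone who wants to try this without a real subtitle file, here is a minimal sketch of the same idea. The temporary sample file and the variable names (sample, cr, match, count) are made up for illustration; command and --color=never are two other ways of keeping the colour alias out of the picture, and the final tr only strips the carriage return so the result prints cleanly.

```shell
# Build a tiny CRLF sample file (hypothetical data, not the real ponyo.srt).
sample=$(mktemp)
printf '816\r\n817\r\n8170\r\n' > "$sample"

# Put a literal carriage return in a variable. In bash or zsh, $'\r' does the same.
cr=$(printf '\r')

# "command" runs grep itself rather than an alias or shell function,
# and --color=never switches highlighting off for this one call.
# The pattern is: start of line, 817, carriage return, end of line.
match=$(command grep --color=never -e "^817${cr}\$" "$sample" | tr -d '\r')

# Count matching lines, to confirm that 8170 was not picked up as well.
count=$(command grep -c -e "^817${cr}\$" "$sample")

echo "matched: $match (lines: $count)"
rm -f "$sample"
```

If highlighting is still wanted, piping the file through tr -d '\r' (or converting it with dos2unix, as suggested in the comments) before grepping avoids the problem at the source.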
0

Two things appear to be causing the hang-up. First, you're using double quotes, which means the shell will interpret some characters rather than passing them all verbatim. Second, you may need to escape the dollar sign with a \, since even with single quotes my shell won't properly search for one. With those changes, grep does return the lines you're looking for (or at least it seems that way to me).
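
One way to check what the shell actually hands to grep is to store each pattern in a variable and look at its length. This is only a sketch with made-up variable names (dq, sq, eol, cr), and it only inspects the shell side: what grep then does with a backslash followed by r depends on the grep implementation.

```shell
dq="^817\r"               # double quotes: \ and r remain two separate characters
sq='^817\r'               # single quotes: the same six characters
eol="^817$"               # a trailing $ inside double quotes is passed through as-is
cr="^817$(printf '\r')"   # a real carriage return appended to the pattern

# Print the length of each pattern: 6, 6, 5 and 5 characters respectively.
printf '%s\n' "${#dq}" "${#sq}" "${#eol}" "${#cr}"
```

In this sketch the double-quoted and single-quoted patterns come out identical, so a difference in results is more likely to come from how the carriage return itself is written than from the choice of quotes.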

re-cursion