6

I want to use the command line to extract subtitles from video files.

I want to extract subtitles from a lot of files. That is why I want a CLI tool.

Ideally it should work with any video format that supports embedded subtitles.

For example:

subextract -f RevolutionOS.mp4

Extracting English.srt
Extracting French.srt
Extracting Russian.srt
All subtitles extracted
Jeff Schaller
Wally
  • 2
    What kind of video files? How are the subtitles encoded? Are they hard coded? Please [edit] your question and give more details. I'm pretty sure the answer is that you can't if the subs are hard coded in the video file though. – terdon Apr 02 '16 at 12:33
  • Your question is vague but [VLC](http://www.videolan.org/vlc/index.html) may help you. – Luc M Apr 02 '16 at 12:46
  • @terdon Have updated the question. I mean embedded subtitles, not hardcoded ones. – Wally Apr 02 '16 at 13:56
  • Is there any difference between embedded and hardcoded? Are you sure this is even possible? – terdon Apr 02 '16 at 13:58
  • 1
    I would imagine that hard-coded subtitles are edited onto the video itself with the text overlaid on top of the video, while embedded subtitles are text files (in one of the common subtitle formats) embedded in the container file (.mp4, .mkv, etc). Extracting the embedded subtitles should be possible - every video player that supports subtitles manages to do it in order to display them. googling `extract subtitle from container` gets about 350,000 results including http://superuser.com/questions/391892/extract-subtitles-from-movie which mentions https://gpac.wp.mines-telecom.fr/mp4box/ – cas Apr 02 '16 at 22:54
  • 1
    (if anyone wants to "steal" my comment and turn it into a real answer, i'll happily upvote it. don't have time myself, or enough interest/knowledge in the subject. anyway, comments are always fair game for stealing into answers). – cas Apr 02 '16 at 22:55

1 Answer

5

There are such tools, but each one is specific to a container type, and they assume the subtitles are stored as text rather than mixed into the video stream.

For your question specifically, the command line would be

MP4Box -srt <trackID> RevolutionOS.mp4

Where possible values of trackID can be deduced from the output of

MP4Box -info RevolutionOS.mp4
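
Since the question is about many files, the command above can be wrapped in a loop. This is only a minimal sketch: `extract_subs` is a hypothetical helper name (not part of MP4Box), and it assumes the subtitles sit in the same track ID in every file, which should be checked with `MP4Box -info` first.

```shell
#!/bin/sh
# Sketch: run the MP4Box command from the answer over many files.
# extract_subs is a hypothetical helper, not something MP4Box provides.
# Assumption: the same track ID holds the subtitles in every file.
extract_subs() {
    track_id=$1
    shift
    for f in "$@"; do
        # Skip unmatched glob patterns and missing files
        [ -f "$f" ] || continue
        # Same invocation as above; report failures and keep going
        MP4Box -srt "$track_id" "$f" || echo "failed: $f" >&2
    done
}

# Example call (track 3 is an assumed value):
# extract_subs 3 *.mp4
```

If the track ID differs between files, each file needs its own `MP4Box -info` check before extraction.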

For subtitles which are mixed into the video stream (so-called hardsubs), OCR software is required. There seem to be ready-made solutions here, for example subtitleripper + GOCR for VobSub (common format for DVD), but I have no experience with those and no idea how good they are.

Dmitry Grigoryev