Re: how to extract subtitles in text format?

Nick Rolfe wrote:
I have bought a DVB-T card and am able to use it with Kaffeine with no
problem. I can record the video, with the subtitles. However, I would
like to know if it is possible to:
- extract these subtitles as text (subtitleripper?)
- get a timestamp in seconds for each line of subtitle, as text too.

Have a look at son2srt <http://www.cs.helsinki.fi/u/mikkila/son2srt/>,
there's a better explanation than I could give there.

(Incidentally, the author talks about taking input from a subtitle
file created by ProjectX. ProjectX is the only tool I've found that
reliably produces synchronised audio/video when used in the
transcoding process; it copes much better with stream errors.)

A friend of mine has modified it and built up a custom symbol database
for the font they use in UK DVB-T broadcasts, but it's still a WIP
(and he's on holiday right now). It works pretty reliably for most
text but often gets caught out by punctuation.

So yes, it is possible, but at the moment it may take quite a bit of
work on your part to get it working.

-Nick

Thanks a lot for all these answers. Actually, gocr seems to work very
well with BBC subtitles.

Eric
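
For the second part of the original question (a timestamp in seconds
for each subtitle line, as text), here is a rough sketch in Python. It
assumes the subtitles have already been converted to SubRip (.srt)
text, for example by son2srt, and that the file follows the usual .srt
layout with timing lines of the form "HH:MM:SS,mmm --> HH:MM:SS,mmm".
The function names and the sample data are made up for illustration;
they are not part of son2srt, ProjectX or gocr.

```python
import re

# Timing line as found in typical .srt files (assumed layout):
#   00:01:02,250 --> 00:01:04,000
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds (float)."""
    return (int(hours) * 3600 + int(minutes) * 60
            + int(seconds) + int(millis) / 1000.0)


def srt_to_lines(srt_text):
    """Return a list of (start_seconds, text), one per subtitle block."""
    results = []
    # Subtitle blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                start = to_seconds(*match.groups()[:4])
                # Everything after the timing line is subtitle text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                results.append((start, text))
                break
    return results


# Hypothetical sample input, only to show the expected file shape.
SAMPLE = """1
00:00:01,500 --> 00:00:03,000
Hello there.

2
00:01:02,250 --> 00:01:04,000
Second line,
continued.
"""

if __name__ == "__main__":
    for start, text in srt_to_lines(SAMPLE):
        print(f"{start:.3f}\t{text}")
```

Run on a real file, this would print one tab-separated line per
subtitle: the start time in seconds, then the text. Only the start
time is kept here; the end time is in the last four captured groups if
it is needed.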
