Re: subtitle missing

Nicolas George <nicolas.george@xxxxxxxxxxxxxx> · Wed, 18 Feb 2009 22:36:36 +0100

On décadi 30 Pluviôse, year CCXVII, Mike Castle wrote:
> There are a number of solutions that will extract the images, run them
> through OCR software, and then you manually proofread them for
> accuracy.  I have no idea how well they work for non-English though (I
> imagine they use spell correction to help in the OCR phase).

I have written one of these solutions, and it uses an OCR tool specialized
for subtitles, where glyphs are pixel-exact. It can be tricky to use,
especially for colons and semicolons and when correcting a mistake,
but under good circumstances it can achieve a perfect extraction of the text
in a short time. It works well with non-ASCII languages, and has another
rare feature: it can recognize and keep italics.
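
To illustrate the general idea of pixel-exact matching, here is a minimal
sketch. It is not taken from exocr: the function names, the '#' bitmap
encoding and the '?' placeholder for unknown glyphs are all assumptions made
for the example. Since every occurrence of a subtitle glyph has identical
pixels, recognition can be a plain dictionary lookup rather than a fuzzy
comparison.

```python
# Hypothetical sketch of exact-match glyph lookup; not exocr's actual code.

def split_glyphs(rows):
    """Cut a bitmap (equal-length strings, '#' = ink) at fully blank columns."""
    width = len(rows[0]) if rows else 0
    glyphs, start = [], None
    for x in range(width + 1):
        # The position just past the last column counts as blank,
        # so a glyph touching the right edge is still closed.
        blank = x == width or all(row[x] != '#' for row in rows)
        if not blank and start is None:
            start = x
        elif blank and start is not None:
            glyphs.append(tuple(row[start:x] for row in rows))
            start = None
    return glyphs


def recognize(rows, known):
    """Map each glyph to its text by exact lookup; '?' marks unknown glyphs."""
    return ''.join(known.get(glyph, '?') for glyph in split_glyphs(rows))
```

In such a scheme, a glyph missing from the table would be shown to the user
once and then reused for every later occurrence. The sketch does not try to
handle the punctuation difficulties mentioned above.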

You may want to give it a try. The source code is here:
http://gitorious.org/projects/exocr/repos/mainline

Regards,

-- 
  Nicolas George
_______________________________________________
MPlayer-users mailing list
MPlayer-users@xxxxxxxxxxxx
https://lists.mplayerhq.hu/mailman/listinfo/mplayer-users