On Decadi 30 Pluviôse, year CCXVII, Mike Castle wrote:
> There are a number of solutions that will extract the images, run them
> through OCR software, and then you manually proofread them for
> accuracy. I have no idea how well they work for non-English though (I
> imagine they use spell correction to help in the OCR phase).

I have written one of these solutions. It uses an OCR tool specialized
for subtitles, where glyphs are pixel-exact. It can be tricky to use,
especially when it comes to colons and semicolons, and when correcting a
mistake, but under good circumstances it can achieve a perfect
extraction of the text in a short time.

It works well with non-ASCII languages, and it has another rare feature:
it can recognize and keep italics. You may want to give it a try. The
source code is here:

http://gitorious.org/projects/exocr/repos/mainline

Regards,

-- 
Nicolas George
_______________________________________________
MPlayer-users mailing list
MPlayer-users@xxxxxxxxxxxx
https://lists.mplayerhq.hu/mailman/listinfo/mplayer-users