Hi Paul,

On Fri, 27 Mar 2020 14:43:01 -0700
Linux for blind general discussion <blinux-list@xxxxxxxxxx> wrote:

> > I don't understand how paragraphs start and end in these files. Otherwise
> > you can try using one of the text processing tools mentioned here:
> >
> > * https://www.shlomifish.org/open-source/resources/text-processing-tools/
> >
> > * https://www.computerhope.com/unix/ufold.htm
> >
> > * https://en.wikipedia.org/wiki/Fmt_(Unix)
> >
> > * https://en.wikipedia.org/wiki/Par_(command)
> >
> > Note that you may have better luck converting EPUBs (assuming they lack
> > https://en.wikipedia.org/wiki/Digital_rights_management ) to plaintext using
> > tools such as https://pandoc.org/ ,
> > https://metacpan.org/search?q=html%3A%3Awikiconverter&size=20 , etc.
>
> Of that list of programs, I'd be inclined to use Pandoc. It permits
> you to write filters in (embedded) Lua, which is a quick-to-learn
> programming language. For example, this Lua one-liner converts a
> string ("s") to add a line break after each existing line break:
>
> s = string.gsub(s, "<BR>", "<BR>\n<BR>")
>

Other tools may work as well. Furthermore, your HTML processing substitution
will not work if one has "<br>" or "<br />" or "<br/>" for newlines or uses the
more recommended https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p
element. Also see:

* https://perl-begin.org/uses/text-parsing/

* https://blog.codinghorror.com/parsing-html-the-cthulhu-way/

> On writing Pandoc filters with Lua, see <https://pandoc.org/lua-filters.html>.
>
> Best regards,
>
> Paul
>

-- 
Shlomi Fish       https://www.shlomifish.org/
https://is.gd/MQHVF3 - The Atom Text Editor edits a 2,000,001B file

Joel’s Generalisation: If it happens to you, it happens to everybody.
(Or: It’s never only you.)
    — Based on http://www.joelonsoftware.com/news/20020402.html

Please reply to list if it's a mailing list post - http://shlom.in/reply .
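
To illustrate the point about "<br>" spellings and the p element, here is a
minimal sketch that uses an HTML parser instead of a literal string
substitution. It is written in Python rather than Lua only because Python's
standard library includes html.parser; that is a swapped-in technique, not what
either poster proposed. The names TextWithBreaks and html_to_text are made up
for this example, and real EPUB content will likely need more care (nested
markup, whitespace handling, and so on).

```python
from html.parser import HTMLParser


class TextWithBreaks(HTMLParser):
    """Collect text, turning <br> into a newline and </p> into a blank line.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        # convert_charrefs defaults to True, so entities such as &amp;
        # reach handle_data already decoded.
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <BR> and <br> both arrive as
        # "br". Self-closing forms (<br/>, <br />) are forwarded here by the
        # default handle_startendtag.
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        # End of a paragraph: separate it from the next one by a blank line.
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML fragment with line and paragraph breaks."""
    parser = TextWithBreaks()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(html_to_text("one<BR>two<br>three<br/>four<br />five"))
    print(html_to_text("<p>First paragraph.</p><p>Second paragraph.</p>"))
```

All four spellings of the line-break tag go through the same branch, which is
what the literal "<BR>" substitution cannot do.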