https://bugzilla.redhat.com/show_bug.cgi?id=2244406

Bug ID: 2244406
Summary: Review Request: python-RTFDE - A library for extracting HTML content from RTF encapsulated HTML
Product: Fedora
Version: rawhide
Hardware: All
OS: Linux
Status: NEW
Component: Package Review
Severity: medium
Priority: medium
Assignee: nobody@xxxxxxxxxxxxxxxxx
Reporter: gui1ty@xxxxxxxxxxxxx
QA Contact: extras-qa@xxxxxxxxxxxxxxxxx
CC: package-review@xxxxxxxxxxxxxxxxxxxxxxx
Target Milestone: ---
Classification: Fedora

Spec URL: https://download.copr.fedorainfracloud.org/results/gui1ty/extract-msg/fedora-rawhide-x86_64/06529620-python-RTFDE/python-RTFDE.spec
SRPM URL: https://download.copr.fedorainfracloud.org/results/gui1ty/extract-msg/fedora-rawhide-x86_64/06529620-python-RTFDE/python-RTFDE-0.1.0-1.20231015git66780b8.fc40.src.rpm

Description:
RTFDE: RTF De-Encapsulator

A python3 library for extracting encapsulated HTML & plain text content from the RTF bodies of .msg files.

De-encapsulation enables previously encapsulated HTML and plain text content to be extracted and rendered as HTML and plain text instead of the encapsulating RTF content. After de-encapsulation, the HTML and plain text should differ only minimally from the original HTML or plain text content.

Features
- De-encapsulate HTML from RTF encapsulated HTML
- De-encapsulate plain text from RTF encapsulated text

Known Issues
- This library _fully_ unquotes text it de-encapsulates because it does not know which text was quoted in the RTF conversion process and which text was quoted in the original html/text. So, for instance escaped Quoted-Printable text will be returned un-escaped.
- This library currently can't combine attachments from a .MSG Message object with the de-encapsulated HTML. This is mostly because I could not get a good set of examples of encapsulated HTML which had attachment objects that needed to be integrated back into the body of the HTML.

Anti-Features (I don't intend to have this library do this.)
- Extract plain text from RTF encapsulated HTML. If you want this, then you will have to parse the HTML using another library.

Fedora Account System Username: gui1ty
Copr Build: https://copr.fedorainfracloud.org/coprs/gui1ty/extract-msg/build/6529620/
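
Note for reviewers unfamiliar with the format: the two features listed in the description correspond to two kinds of encapsulating RTF body, which Microsoft's RTF encapsulation specification marks in the RTF header with the control words \fromhtml1 (encapsulated HTML) and \fromtext (encapsulated plain text). The sketch below only illustrates that distinction using the Python standard library. It is not RTFDE's implementation or API; the function name detect_encapsulation and the 4096-byte scan window are assumptions made for illustration.

```python
import re

# \fromhtml1 and \fromtext are the header control words for encapsulated
# HTML and encapsulated plain text. The lookahead rejects longer control
# words that merely start with the same letters.
_FROM_RE = re.compile(rb"\\(fromhtml1|fromtext)(?![A-Za-z0-9])")


def detect_encapsulation(raw_rtf: bytes, window: int = 4096):
    """Return 'html', 'text', or None for a raw RTF body.

    Hypothetical helper: scans only the first `window` bytes, on the
    assumption that the marker sits in the RTF header.
    """
    # Anything that does not open an RTF group is not RTF at all.
    if not raw_rtf.lstrip().startswith(b"{\\rtf"):
        return None
    match = _FROM_RE.search(raw_rtf[:window])
    if match is None:
        return None  # ordinary RTF with nothing encapsulated
    return "html" if match.group(1) == b"fromhtml1" else "text"


if __name__ == "__main__":
    sample = b"{\\rtf1\\ansi\\ansicpg1252\\fromhtml1 \\deff0{\\fonttbl}}"
    print(detect_encapsulation(sample))
```

A body for which this returns None is plain RTF, so there is nothing to de-encapsulate. The sketch does no actual extraction, which is the part the packaged library provides.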