[Bug 2244406] New: Review Request: python-RTFDE - A library for extracting HTML content from RTF encapsulated HTML

https://bugzilla.redhat.com/show_bug.cgi?id=2244406

            Bug ID: 2244406
           Summary: Review Request: python-RTFDE - A library for
                    extracting HTML content from RTF encapsulated HTML
           Product: Fedora
           Version: rawhide
          Hardware: All
                OS: Linux
            Status: NEW
         Component: Package Review
          Severity: medium
          Priority: medium
          Assignee: nobody@xxxxxxxxxxxxxxxxx
          Reporter: gui1ty@xxxxxxxxxxxxx
        QA Contact: extras-qa@xxxxxxxxxxxxxxxxx
                CC: package-review@xxxxxxxxxxxxxxxxxxxxxxx
  Target Milestone: ---
    Classification: Fedora



Spec URL:
https://download.copr.fedorainfracloud.org/results/gui1ty/extract-msg/fedora-rawhide-x86_64/06529620-python-RTFDE/python-RTFDE.spec
SRPM URL:
https://download.copr.fedorainfracloud.org/results/gui1ty/extract-msg/fedora-rawhide-x86_64/06529620-python-RTFDE/python-RTFDE-0.1.0-1.20231015git66780b8.fc40.src.rpm

Description:
RTFDE: RTF De-Encapsulator

A Python 3 library for extracting encapsulated HTML and plain text content
from the RTF bodies of .msg files.

De-encapsulation enables previously encapsulated HTML and plain text
content to be extracted and rendered as HTML and plain text instead of
the encapsulating RTF content. After de-encapsulation, the HTML and
plain text should differ only minimally from the original HTML or plain
text content.

Features

 - De-encapsulate HTML from RTF encapsulated HTML
 - De-encapsulate plain text from RTF encapsulated text
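
As a rough illustration of the two cases listed above, the sketch below only classifies an RTF body as encapsulated HTML or encapsulated plain text. It assumes the `\fromhtml1` and `\fromtext` control words from the RTF encapsulation format appear in the first part of the document. The helper name `encapsulation_type` and the `header_limit` parameter are hypothetical; this is not RTFDE's API or its implementation, and it does not perform any de-encapsulation.

```python
import re
from typing import Optional

# Control words that mark an encapsulating RTF document:
# \fromhtml1 for encapsulated HTML, \fromtext for encapsulated plain text.
# The lookahead rejects longer control words that merely share the prefix.
_FROM_HTML = re.compile(rb"\\fromhtml1(?![A-Za-z0-9])")
_FROM_TEXT = re.compile(rb"\\fromtext(?![A-Za-z0-9])")


def encapsulation_type(raw_rtf: bytes, header_limit: int = 1024) -> Optional[str]:
    """Return 'html', 'text', or None for an RTF body (hypothetical helper).

    Only the first `header_limit` bytes are inspected, on the assumption
    that the marker sits in the RTF header.
    """
    header = raw_rtf[:header_limit]
    if not header.lstrip().startswith(b"{\\rtf1"):
        return None  # not an RTF document at all
    if _FROM_HTML.search(header):
        return "html"
    if _FROM_TEXT.search(header):
        return "text"
    return None  # plain RTF without encapsulated content


if __name__ == "__main__":
    sample = b"{\\rtf1\\ansi\\ansicpg1252\\fromhtml1 \\deff0{\\fonttbl}}"
    print(encapsulation_type(sample))
```

A check like this only tells a caller which kind of content to expect; recovering the content itself is the part the library is packaged for.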

Known Issues

 - This library _fully_ unquotes text it de-encapsulates because it does
 not know which text was quoted in the RTF conversion process and which
 text was quoted in the original HTML/text. So, for instance, escaped
 Quoted-Printable text will be returned un-escaped.
 - This library currently can't combine attachments from a .MSG Message
 object with the de-encapsulated HTML. This is mostly because I could
 not get a good set of examples of encapsulated HTML which had
 attachment objects that needed to be integrated back into the body of
 the HTML.

Anti-Features (I don't intend to have this library do this.)

 - Extract plain text from RTF encapsulated HTML. If you want this,
 then you will have to parse the HTML using another library.
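
For readers who do want plain text from the de-encapsulated HTML, a minimal sketch using only the Python standard library's `html.parser` might look like the following. The names `TextExtractor` and `html_to_text` are hypothetical and not part of RTFDE; a dedicated HTML library will usually handle real-world markup better.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space and collapse whitespace runs. This is
        # crude: a word split across inline tags gains a space.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    print(html_to_text("<p>Hello &amp; <b>world</b></p>"))
```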

Fedora Account System Username: gui1ty

Copr Build:
https://copr.fedorainfracloud.org/coprs/gui1ty/extract-msg/build/6529620/





