proxy_html / xml2enc won't handle certain HTML entities

Hi there,

We have a Tomcat 8 backend server behind an Apache 2.4 proxy. Our Apache conf:

        ProxyPreserveHost       on
        ProxyHTMLEnable         on
        ProxyHTMLExtended       on
        # Backend (Tomcat) charset is ISO-8859-1
        ProxyHTMLCharsetOut     *
        ProxyHTMLMeta           on
        ProxyHTMLDocType        "<!DOCTYPE html>" XML

        <Location "/">
                ProxyHTMLURLMap                 "/backend-path/(.*)" "/$1" R
                ProxyPass                       "http://backend-host:8080/backend-path/"
                ProxyPassReverse                "http://backend-host:8080/backend-path/"
                ProxyPassReverseCookieDomain    backend-host "%{HTTP_HOST}s"
                ProxyPassReverseCookiePath      "/backend-path/" "/"
        </Location>

Everything works fine except for a few HTML entities. The ones we have found so far are: &rarr; &larr; &uarr; &darr; &#x025B8;. Whenever the backend's HTML response includes one of those:

1. Apache's response is erratic: it either drops parts of the HTML or resets the connection altogether.

2. Error log shows:

        [Thu May 07 18:07:54.934922 2020] [xml2enc:error] [pid 12355:tid 139930604844800] [client (_redacted_):33206] AH01444: Skipping invalid byte(s) in input stream!, referer: (_redacted_)

We first saw this on version 2.4.38 (Debian-shipped) and have reproduced it on version 2.4.43 (just built from source on Debian 10.3 amd64).

As far as I know, those "faulty" HTML entities are fully standard. Some others such as &nbsp; or letters with diacritics (&ntilde;, &aacute;...) pass through just fine.
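
In case it helps narrow this down, here is a minimal Python sketch (not part of our setup; the helper name `fits_latin1` is made up for illustration) that decodes each entity and checks whether the resulting character exists in ISO-8859-1. The failing entities all decode to characters outside ISO-8859-1, while the ones that pass through are inside it. We are only assuming this is related to the AH01444 error; we have not confirmed it.

```python
import html

# Entities that trigger the problem vs. entities that pass through fine,
# as listed in this message.
FAILING = ["&rarr;", "&larr;", "&uarr;", "&darr;", "&#x025B8;"]
PASSING = ["&nbsp;", "&ntilde;", "&aacute;"]


def fits_latin1(entity: str) -> bool:
    """Return True if the decoded entity can be encoded as ISO-8859-1."""
    text = html.unescape(entity)
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for entity in FAILING + PASSING:
        char = html.unescape(entity)
        print(f"{entity:10} U+{ord(char):04X} fits ISO-8859-1: {fits_latin1(entity)}")
```

If that correlation is the cause, any other entity that decodes to a character outside ISO-8859-1 might be expected to fail the same way, but that is a guess on our side.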

By the way, we've found that replacing 

        ProxyHTMLEnable         on

with

        SetOutputFilter        proxy-html

works fine for those HTML entities (although it has some drawbacks with non-English characters, so it's of no use to us).

Are we doing something wrong, maybe?

Thank you all in advance. Best regards,

Antonio
