Re: proxy_html / xml2enc won't handle certain HTML entities

Hi Nick,

Your glass of wine was inspiring: I just removed

>        ProxyHTMLCharsetOut     *   # Backend (Tomcat) charset is ISO-8859-1

and the problem's gone!

Also commented out 

>        ProxyHTMLMeta           on

with no noticeable change in behaviour. As per the docs "turning ProxyHTMLMeta Off will give a small performance boost", so off it goes.
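
For the record, a sketch of the relevant part of our config after both removals. Only the directives quoted in this thread are shown, and I am assuming nothing else in the vhost matters here:

```apache
ProxyPreserveHost       on
ProxyHTMLEnable         on
ProxyHTMLExtended       on
# Removed, this fixed the entity problem:
#   ProxyHTMLCharsetOut     *
# Removed, no visible change for us:
#   ProxyHTMLMeta           on
```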

Thank you so much!

FYI, with LogLevel raised to info, the error log shows:

        [Fri May 08 07:42:35.790051 2020] [xml2enc:info] [pid 13183:tid 139823008806656] [client _redacted_:55344] AH01431: Got charset ISO-8859-1 from HTTP headers

So our backend's stated charset is ISO-8859-1. 
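
In case it is useful to others, the level can probably be raised for that one module only, rather than globally. This is a sketch using the per-module LogLevel syntax of Apache 2.4; the `warn` base level is an assumption, not our actual setting:

```apache
# Keep everything else at warn (assumed base level),
# raise only mod_xml2enc to info
LogLevel warn xml2enc:info
```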

About your questions:

> Are you sure your backend is sending literally those entities, as opposed to their byte representations in its charset?
> Note that libxml2 is doing the hard work here: what version of libxml2 do you have?

The "faulty" entities are coded verbatim in the backend JSP pages (e.g. the entity for "→"), and are rendered exactly that way in non-proxied responses. The libxml2 version is 2.9.4 (Debian 10.3 amd64).

I can do further testing, if you need it.

FYI 2 (side point):

>        <Location "/">
>                ProxyHTMLURLMap                 "/backend-path/(.*)" "/$1" R

We had some previous experience with proxy URL mapping, and "/frontend-path/" <-> "/backend-path/" has always worked fine for us without the regexp. But mapping the root frontend path "/" gave us some trouble; maybe there's a better solution, but that regexp solved the issue.
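
To make the difference concrete, a sketch of the two forms, using the placeholder paths from above. The surrounding proxy directives are omitted, and the first block illustrates the general pattern rather than copying any real config:

```apache
# Plain prefix mapping: has always worked for us
<Location "/frontend-path/">
        ProxyHTMLURLMap         "/backend-path/" "/frontend-path/"
</Location>

# Root mapping: the R flag makes the pattern a regular expression,
# and $1 carries over whatever follows /backend-path/
<Location "/">
        ProxyHTMLURLMap         "/backend-path/(.*)" "/$1" R
</Location>
```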

Thank you again. Best regards,

Antonio

----- Original Message -----
From: "Nick Kew" <niq@xxxxxxxxxx>
To: "users" <users@xxxxxxxxxxxxxxxx>
Sent: Friday, 8 May 2020 1:49:25
Subject: Re: proxy_html / xml2enc won't handle certain HTML entities

> On 7 May 2020, at 17:52, Antonio Suárez Pozuelo <a.suarez@xxxxxxxxxxxxxxx> wrote:
> 
> Hi there,

Further to my last reply, I can see what may possibly be wrong:

> We have a Tomcat 8 backend server behind an Apache 2.4 proxy. Our Apache conf:
> 
>        ProxyPreserveHost       on
>        ProxyHTMLEnable         on
>        ProxyHTMLExtended       on

You probably don't want that.

>        ProxyHTMLCharsetOut     *   # Backend (Tomcat) charset is ISO-8859-1

I suspect that is very probably the culprit.
Does removing it fix the problem?


>        ProxyHTMLMeta           on

You probably also don't want that.  I think the documentation of that
is misleadingly out-of-date, but I don't want to check now (late, and
after a glass of wine).

-- 
Nick Kew


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
