Re: proxy_html / xml2enc won't handle certain HTML entities

> On 7 May 2020, at 17:52, Antonio Suárez Pozuelo <a.suarez@xxxxxxxxxxxxxxx> wrote:
> 
> Hi there,


>        <Location "/">
>                ProxyHTMLURLMap                 "/backend-path/(.*)" "/$1" R

Minor point: there's no need for a regexp there.  Just map  /backend-path/  to  /
(the remainder of the URL will be left untouched).
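
As a rough sketch, assuming the rest of your <Location> block stays as it is
(I've only seen the two directives you quoted, so the layout is a guess):

        <Location "/">
                ProxyHTMLEnable         On
                # Plain prefix match: no regexp and no R flag needed
                ProxyHTMLURLMap         /backend-path/  /
        </Location>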

> Everything works fine but for a few HTML entities; detected so far: &rarr; &larr; &uarr; &darr; &#x025B8;. Whenever the backend response HTML includes one of those:

Are you sure your backend is sending literally those entities, as opposed to their byte
representations in its charset?

> 1. Apache's response's erratic: it either drops parts of the HTML or resets the connection altogether.
> 
> 2. Error log shows:
> 
>        [Thu May 07 18:07:54.934922 2020] [xml2enc:error] [pid 12355:tid 139930604844800] [client (_redacted_):33206] AH01444: Skipping invalid byte(s) in input stream!, referer: (_redacted_)

That means there's something in the input stream that's a mismatch with the charset
detected.  Either that or you've found a bug.  Note that libxml2 is doing the hard work
here: what version of libxml2 do you have?

And is there any other filter involved?

> 
> By the way, we've found that replacing 
> 
>        ProxyHTMLEnable         on
> 
> with
> 
>        SetOutputFilter        proxy-html

Yes, that configuration skips mod_xml2enc entirely, which means you have no i18n
support in mod_proxy_html.  So

>     (although it has some drawbacks with non-english characters, so it's of no use for us).

is expected behaviour.

> Are we doing something wrong, maybe?

Nothing obviously wrong, though your backend/app may be misconfigured.

If you increase LogLevel to INFO, mod_xml2enc will report what charset it detects,
so you can check whether that's what you think it should be.  Probably better to
increase it to DEBUG, which will get quite a lot more info from mod_xml2enc.
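
Something along these lines should do it, assuming httpd 2.4 (per-module
log levels) and that "warn" is your current global level, which is a guess
on my part:

        # Keep the global level, raise only the two filter modules
        LogLevel warn xml2enc:debug proxy_html:debug

That keeps the rest of the error log quiet while you reproduce the problem.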

If that doesn't help you figure it out, post the mod_xml2enc messages at level
DEBUG here and I'll take a look.

-- 
Nick Kew