Re: Utf8 issues with FILTER_SANITIZE_URL

On 12.02.2016 at 13:30, Ashley Sheridan wrote:

> I've noticed that the sanitisation filter FILTER_SANITIZE_URL is not working quite as the documentation suggests.
> 
> In particular, the documentation says this filter removes all characters except letters, digits, and a small list of specific characters. I took "letters" in this context to mean the same as \p{L} in the preg_* functions, but it appears to mean only [a-zA-Z] here. I need characters like êéö to not be stripped (these are valid in URLs and have been widely supported in browsers and servers for years).
> 
> Is UTF-8 support on this filter intentionally missing, or is there a flag I need to set in order for it to work correctly?

Well, UTF-8 support is neither intentionally missing, nor is there a
flag to change the behavior: UTF-8 support is simply not implemented
(yet).  See also the definition of allowed_list[1], and the definition
of LOWALPHA and HIALPHA[2].
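
As a stopgap, the non-ASCII bytes could be percent-encoded before the
filter runs, so that they survive as %XX sequences ('%' and the hex
digits are in the allowed list).  A minimal sketch, not a definitive
implementation: sanitize_url_keep_utf8() is a hypothetical helper name
(nothing like it exists in ext/filter), and it assumes the input is
already valid UTF-8.

```php
<?php
// Hypothetical helper (not part of ext/filter): percent-encode every
// non-ASCII byte first, then let FILTER_SANITIZE_URL do its usual work.
function sanitize_url_keep_utf8($url)
{
    $encoded = preg_replace_callback(
        '/[\x80-\xFF]+/',            // no /u flag: matches raw bytes >= 0x80
        function ($m) {
            return rawurlencode($m[0]);
        },
        $url
    );

    return filter_var($encoded, FILTER_SANITIZE_URL);
}

$url = "http://example.com/caf\xC3\xA9";   // "café" as UTF-8 bytes

var_dump(filter_var($url, FILTER_SANITIZE_URL));  // bytes C3 A9 are stripped
var_dump(sanitize_url_keep_utf8($url));           // bytes kept as %C3%A9
```

This only makes sense for the path and query parts; a non-ASCII host
name would need IDN (Punycode) conversion rather than percent-encoding.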

Consider filing a feature request, but please double-check that there
isn't already one.  I guess there is.

[1]
<https://github.com/php/php-src/blob/php-7.0.3/ext/filter/sanitizing_filters.c#L324>
[2]
<https://github.com/php/php-src/blob/php-7.0.3/ext/filter/sanitizing_filters.c#L60>

-- 
Christoph M. Becker


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



