On 12.02.2016 at 13:30, Ashley Sheridan wrote:

> I've noticed that the sanitisation filter FILTER_SANITIZE_URL is not
> working quite as the documentation suggests.
>
> In particular, the documentation says this filter removes all
> characters except letters, digits, and a small list of specific
> characters. I took "letters" in this context to mean the same as
> \p{L} in the preg_* functions, but it appears to mean only [a-zA-Z]
> here. I need characters like êéö not to be stripped (these are valid
> in URLs and have been widely supported in browsers and servers for
> years).
>
> Is UTF-8 support intentionally missing from this filter, or is there
> a flag I need to set for it to work correctly?

Well, UTF-8 support is neither intentionally missing, nor is there a
flag to change the behavior: UTF-8 support is simply not implemented
(yet). See the definition of allowed_list[1] and the definitions of
LOWALPHA and HIALPHA[2], which cover only the ASCII letters a-z and A-Z.

Consider filing a feature request, but please double-check first
whether one already exists. I suspect there is one.

[1] <https://github.com/php/php-src/blob/php-7.0.3/ext/filter/sanitizing_filters.c#L324>
[2] <https://github.com/php/php-src/blob/php-7.0.3/ext/filter/sanitizing_filters.c#L60>

-- 
Christoph M. Becker

-- 
PHP General Mailing List (http://www.php.net/)
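
To illustrate the behavior discussed above, here is a minimal PHP sketch. It first shows FILTER_SANITIZE_URL dropping the UTF-8 bytes of "é", then one possible workaround: percent-encoding every non-ASCII byte before filtering. The helper name sanitize_url_keep_utf8 is hypothetical and not part of ext/filter, and the workaround is an assumption on my part rather than something the filter extension offers. It also does not cover non-ASCII host names, which would need IDN conversion instead.

```php
<?php
// Hypothetical helper (not part of ext/filter): percent-encode each
// non-ASCII byte so FILTER_SANITIZE_URL finds nothing it would strip.
function sanitize_url_keep_utf8($url)
{
    // No /u modifier: the pattern matches single bytes >= 0x80, so each
    // byte of a multi-byte UTF-8 sequence is encoded separately.
    $encoded = preg_replace_callback(
        '/[\x80-\xFF]/',
        function ($m) {
            return rawurlencode($m[0]);
        },
        $url
    );

    return filter_var($encoded, FILTER_SANITIZE_URL);
}

$url = "http://example.com/caf\xC3\xA9"; // "café" encoded as UTF-8

// Plain filter: the two bytes of "é" are removed.
var_dump(filter_var($url, FILTER_SANITIZE_URL));
// string(22) "http://example.com/caf"

// With the pre-encoding step the character survives as %C3%A9.
var_dump(sanitize_url_keep_utf8($url));
// string(28) "http://example.com/caf%C3%A9"
```

Percent-encoding keeps the path within the ASCII range the filter allows, so the result still round-trips to the original characters with rawurldecode().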