Search Postgresql Archives

Re: unicode match normal forms

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 17 May 2021 13:27:40 -0000
hamann.w@xxxxxxxxxxx wrote:
> in unicode letter ä exists in two versions - linux and windows use a
> composite whereas macos prefers the decomposed form. Is there any
> way to make a semi-exact match that accepts both variants?

Actually, re-reading your request, you want to *find* both forms?

In which case, a function index may be more useful::

  create index filename_unique_normalized
  on that_table(normalize(filename));

then you can search::

  select *
  from that_table
  where normalize(filename)=?

If you want to make sure that no two rows contain "equally-looking"
filenames, you can use a unique index::

  create unique index filename_unique_normalized
  on that_table(normalize(filename));

(while we're on the topic of "equally-looking" characters, you may
want to look at https://en.wikipedia.org/wiki/Homoglyph and
https://www.unicode.org/reports/tr36/ )

-- 
	Dakkar - <Mobilis in mobile>
	GPG public key fingerprint = A071 E618 DD2C 5901 9574
	                             6FE2 40EA 9883 7519 3F88
	                    key id = 0x75193F88







[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux