How about splitting the value into individual words while keeping their order?
Then add words up to form individual phrases, ensuring that each phrase consists only of unique/distinct words,
and count the repeated phrases afterward.
How about this?
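
To make the idea concrete, here is a rough sketch in Python rather than SQL. It is only one reading of the steps above, under assumptions that are mine and not settled in this thread: words are runs of letters/digits (so punctuation such as the comma in 'London,' is dropped), matching is case-sensitive, and a new phrase starts as soon as a word would repeat within the current phrase. The function names split_into_phrases and repeated_phrases are made up for illustration.

```python
import re
from collections import Counter


def split_into_phrases(text):
    """Greedily cut text into phrases whose words are all distinct.

    Assumption: a word is a run of word characters, so punctuation is ignored.
    Assumption: a phrase ends right before the first word it already contains.
    """
    words = re.findall(r"\w+", text)
    phrases = []
    current = []   # words of the phrase being built, in order
    seen = set()   # same words, for fast membership checks
    for word in words:
        if word in seen:
            # The word would repeat, so close this phrase and start a new one.
            phrases.append(" ".join(current))
            current = []
            seen = set()
        current.append(word)
        seen.add(word)
    if current:
        phrases.append(" ".join(current))
    return phrases


def repeated_phrases(text):
    """Return only the phrases that occur more than once, with their counts."""
    counts = Counter(split_into_phrases(text))
    return {phrase: n for phrase, n in counts.items() if n > 1}


print(repeated_phrases("Hello World World Hello Hello World"))
print(repeated_phrases("Hello World World Hello"))
print(repeated_phrases("The City of London, London"))
```

With these assumptions, the third example from the original question gives 2 occurrences of 'Hello World', and the other two give no repeated phrase. In Postgres the splitting step would map onto something like regexp_split_to_table; the rest is a sketch of the logic, not a tested SQL solution.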
Regards,
David
On Tue, 25 Jan 2022 at 17:22, Karsten Hilbert <Karsten.Hilbert@xxxxxxx> wrote:
> Standard Postgres seems to be short of a function to do the following:
>
> It is easy to count the number of occurrences of words, but it is rather difficult to count the number of occurrences of phrases.
>
> For instance:
>
> A cell of value: 'Hello World' means 1 occurrence of a phrase.
>
> A cell of value: 'Hello World World Hello' means no occurrence of any repeated phrase.
>
> But a cell of value: 'Hello World World Hello Hello World' means 2 occurrences of 'Hello World'.
>
> 'The City of London, London' also has no occurrences of any repeated phrase.
>
> Does anyone have such a function to count the number of occurrences of any repeated phrases?
For that to become answerable you may want to define what to
do when facing ambiguity.
Best,
Karsten