Hi, Peter,
How can I define a word boundary as any one of
^, a space, or $,
so that the following two cases can be told apart?
"fox fox" is a repeat.
"foxfox" is not a repeat, but just one word.
Regards,
David
On Thu, 9 Dec 2021 at 13:35, Peter J. Holzer <hjp-pgsql@xxxxxx> wrote:
On 2021-12-09 12:38:15 +0000, Shaozhong SHI wrote:
> Does anyone know how to detect a repeated phrase in a string?
Use regular expressions with backreferences:
bayes=> select regexp_match('foo wikiwiki bar', '(.+)\1');
╔══════════════╗
║ regexp_match ║
╟──────────────╢
║ {o} ║
╚══════════════╝
(1 row)
"o" is repeated in "foo".
bayes=> select regexp_match('fo wikiwiki bar', '(.+)\1');
╔══════════════╗
║ regexp_match ║
╟──────────────╢
║ {wiki} ║
╚══════════════╝
(1 row)
"wiki" is repeated in "wikiwiki".
bayes=> select regexp_match('fo wikiwi bar', '(.+)\1');
╔══════════════╗
║ regexp_match ║
╟──────────────╢
║ (∅) ║
╚══════════════╝
(1 row)
Nothing is repeated.
Adjust the expression within parentheses if you want to match something
more specific than any sequence of one or more characters.
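For instance, to report only whole-word repeats (so that "fox fox" counts
but "foxfox" does not), one option is to anchor the pattern with
PostgreSQL's word constraints \m (start of word) and \M (end of word).
The queries below are a minimal sketch rather than a complete solution:
they assume a word is a run of \w characters (letters, digits, underscore)
and that the two copies are separated by exactly one space. The last query
is a variant that treats only the start of the string, a space, or the end
of the string as a boundary.

```sql
-- Whole-word repeat: \m = start of word, \M = end of word,
-- \1 = backreference to the word captured by (\w+).
select regexp_match('fox fox is a repeat', '\m(\w+) \1\M');     -- expected: {fox}
select regexp_match('foxfox is not a repeat', '\m(\w+) \1\M');  -- expected: NULL

-- Variant: boundary is ^, a single space, or $ only.
-- (?: ... ) groups without capturing, so \1 still refers to the word.
select regexp_match('fox fox is a repeat', '(?:^| )([^ ]+) \1(?: |$)');
```

Replacing the single space with \s+ would also allow tabs or several
spaces between the two copies.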
hp
--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@xxxxxx         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"