
No Greek stop words in FTS?


 



Hi,

I maintain a project (diofanti.org) that tracks public spending in Greece.
It’s a PostgreSQL instance hosting 55M+ JSON documents, with search functionality on top of them.

It relies heavily on to_tsvector('greek', ...), as users search for company names, invoice descriptions, etc.

The results are fairly good, but while experimenting with adding some more domain-specific stop words, I realised there’s no greek.stop under $(pg_config --sharedir)/tsearch_data.
And indeed, it looks like stop words are kept (not filtered out) by to_tsvector('greek', ...), unlike with 'english':

select to_tsvector('greek', 'ΚΑΛΗΜΕΡΑ ΚΑΙ ΣΕ ΕΣΑΣ'); --> 'εσ':4 'κα':2 'καλημερ':1 'σε':3 
select to_tsvector('english', 'AND GOOD MORNING TO YOU TOO'); --> 'good':2 'morn':3
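
For context, this is roughly the setup I was hoping to end up with. It is only a minimal sketch, assuming I supply my own stop word file: greek_custom.stop, greek_stem_sw and greek_sw are hypothetical names I made up, and nothing like them ships with PostgreSQL as far as I can tell.

```sql
-- Assumption: a hypothetical file greek_custom.stop (one stop word per line,
-- UTF-8, lowercase) has been placed under $(pg_config --sharedir)/tsearch_data.

-- Snowball stemmer for Greek, with a stop word list attached.
-- StopWords takes the file's base name, without the .stop extension.
CREATE TEXT SEARCH DICTIONARY greek_stem_sw (
    TEMPLATE  = snowball,
    Language  = greek,
    StopWords = greek_custom
);

-- Start from the built-in 'greek' configuration and swap in the new
-- dictionary for the non-ASCII word token types.
CREATE TEXT SEARCH CONFIGURATION greek_sw (COPY = greek);

ALTER TEXT SEARCH CONFIGURATION greek_sw
    ALTER MAPPING FOR word, hword, hword_part
    WITH greek_stem_sw;

-- Same test as above, against the new configuration.
SELECT to_tsvector('greek_sw', 'ΚΑΛΗΜΕΡΑ ΚΑΙ ΣΕ ΕΣΑΣ');
```

If greek_custom.stop contained entries such as και and σε, I would expect those lexemes to disappear from the result, but I have not verified this.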

I found an older discussion on pgsql-hackers [0], but I'm not sure where it ended up, or whether any work was ever started.

Am I missing something?
Is there another thread/patch I can pick up myself?



