On 19 February 2016 at 14:19, Bruce Momjian <bruce@xxxxxxxxxx> wrote:
On Fri, Feb 19, 2016 at 11:53:26AM +0000, Simon Riggs wrote:
> On 19 February 2016 at 11:46, Thomas Kellerer <spam_eater@xxxxxxx> wrote:
>
> Daniel Westermann wrote on 19.02.2016 at 12:41:
> >>>> if I'd need to implement/replace Oracle Text (ww.oracle.com/
> technetwork/testcontent/index-098492.html).
> >>>>> What choices do I have in PostgreSQL (9.5+) ?
> >
> >>Postgres also has a full text search (which I find much easier to use
> than Oracle's):
> >>
> >>http://www.postgresql.org/docs/current/static/textsearch.html
> >
> > Yes, I have seen this. Can this be used to index and search binary
> documents, e.g. PDF?
>
> Ah, no. That's not possible
>
>
> ...not possible, Yet.
>
> PostgreSQL grows by adding the features people need, and it's changing rapidly.
I wonder if PLPerl could be used to extract the words from a PDF
document and create a tsvector column from it.
I don't know about PLPerl (though I'm pretty sure it could be used for this purpose). On the other hand, I've written code for this in Python, which should be easy to adapt for PLPython if necessary.
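
To make the idea concrete, here is a rough sketch of the Python side. It is not the code mentioned above, just an illustration under some assumptions: the third-party pypdf package does the text extraction, the table `documents` with columns `id` and `body_tsv` is hypothetical, and the `%s` placeholders assume a psycopg2-style driver. Inside PLPython the same statement would be run through plpy instead.

```python
import re


def extract_pdf_text(path):
    """Return the text of every page in a PDF, joined by newlines.

    Assumption: the third-party pypdf package is installed. It is
    imported lazily so the rest of this sketch works without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() may yield nothing for image-only (scanned) pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def normalize_text(text):
    """Replace NUL characters (PostgreSQL text cannot store them)
    and collapse runs of whitespace into single spaces."""
    text = text.replace("\x00", " ")
    return re.sub(r"\s+", " ", text).strip()


def build_tsvector_update(doc_id, text, config="english"):
    """Return (sql, params) for a parameterized UPDATE.

    The table `documents` and its columns `id` and `body_tsv`
    are hypothetical names chosen for this example.
    """
    sql = (
        "UPDATE documents "
        "SET body_tsv = to_tsvector(%s::regconfig, %s) "
        "WHERE id = %s"
    )
    return sql, (config, normalize_text(text), doc_id)
```

With a driver cursor this would be used roughly as `cur.execute(*build_tsvector_update(42, extract_pdf_text("manual.pdf")))`. Scanned PDFs without a text layer would still need OCR first, which this sketch does not cover.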