
Re: tsearch2 and pdf files


 



1. Convert the PDF to plain text with a tool such as xpdf.
2. Insert the extracted text into a table of your choice.
3. Build tsearch2 vectors (tsvector) from the text.
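
The three steps above could be sketched roughly as follows. This is only an illustration, not something taken from the thread: it assumes the pdftotext program from xpdf is on the PATH, that tsearch2's to_tsvector() and the GiST index support are installed, and that you pass in a DB-API cursor that uses %s placeholders (psycopg2, for instance). The table pdf_docs and its columns are hypothetical names.

```python
import subprocess

# Hypothetical schema: table and column names are illustrative only.
CREATE_TABLE_SQL = """
CREATE TABLE pdf_docs (
    id       serial PRIMARY KEY,
    filename text NOT NULL,
    body     text,
    body_fti tsvector
);
CREATE INDEX pdf_docs_fti_idx ON pdf_docs USING gist (body_fti);
"""

INSERT_SQL = "INSERT INTO pdf_docs (filename, body) VALUES (%s, %s)"

# Step 3: fill the vector column for rows that do not have one yet.
UPDATE_VECTOR_SQL = (
    "UPDATE pdf_docs SET body_fti = to_tsvector(body) "
    "WHERE body_fti IS NULL"
)

# Searching afterwards uses the tsearch2 match operator.
SEARCH_SQL = "SELECT filename FROM pdf_docs WHERE body_fti @@ to_tsquery(%s)"


def pdftotext_command(pdf_path):
    """Step 1: command line that writes the PDF's text to stdout ('-')."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(
        pdftotext_command(pdf_path), check=True, capture_output=True
    )
    return result.stdout.decode("utf-8", errors="replace")


def index_pdf(cursor, pdf_path, extract=extract_text):
    """Steps 2 and 3: store the text, then build its tsvector."""
    cursor.execute(INSERT_SQL, (pdf_path, extract(pdf_path)))
    cursor.execute(UPDATE_VECTOR_SQL)
```

With a real connection you would run CREATE_TABLE_SQL once, call index_pdf(cursor, path) for each file, commit, and then query with SEARCH_SQL. Keeping the extractor as a parameter makes it easy to swap in another converter.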

Cheers,


On 11 Dec 2006, at 18:23, Philip Johnson wrote:

Do you know what kind of table I should use?
Is there a shell script or a PHP script that does the work?

Regards

-----Original Message-----
From: pgsql-general-owner@xxxxxxxxxxxxxx [mailto:pgsql-general-owner@xxxxxxxxxxxxxx] On Behalf Of Hannes Dorbath
Sent: Monday, 11 December 2006 12:21
To: pgsql-general@xxxxxxxxxxxxxx
Subject: Re: [GENERAL] tsearch2 and pdf files

You just need software that extracts the text from the PDF. Search Google for pdf2txt and similar tools. Printer drivers that try to extract text from any kind of document are available as well.


On 11.12.2006 11:41, Philip Johnson wrote:
I'm using PostgreSQL 8.1.5.

Tsearch2 is installed and runs well.

I'd like to use tsearch2 to index PDF files.

Does anyone have a detailed process to implement that?


--
Regards,
Hannes Dorbath

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings




