
RESOLVED: Re: Maximum document-size of text-search?


On 07/22/2010 03:31 PM, Andreas Joseph Krogh wrote:
Hi.
I'm trying to index the contents of Word documents (the extracted text), which sometimes leads to quite large documents. This results in the following exception:

Caused by: org.postgresql.util.PSQLException: ERROR: index row requires 10376 bytes, maximum size is 8191

I have the following schema:
andreak=# \d origo_search_index
                                       Table "public.origo_search_index"
          Column          |       Type        |                            Modifiers
--------------------------+-------------------+-----------------------------------------------------------------
 id                       | integer           | not null default nextval('origo_search_index_id_seq'::regclass)
 entity_id                | integer           | not null
 entity_type              | character varying | not null
 field                    | character varying | not null
 search_value             | character varying | not null
 textsearchable_index_col | tsvector          |

    "origo_search_index_fts_idx" gin (textsearchable_index_col)

Triggers:
update_search_index_tsvector_t BEFORE INSERT OR UPDATE ON origo_search_index FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('textsearchable_index_col', 'pg_catalog.english', 'search_value')

I store all the text extracted from the documents in "search_value" and have the built-in trigger tsvector_update_trigger update the tsvector-column.

Any hints on how to get around this issue to allow indexing large documents? I don't see how "only index the first N bytes of the document" would be of interest to anyone...

BTW: I'm using PG-9.0beta3

Never mind... I also had a btree index on search_value, which of course caused the problem.
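
For anyone hitting the same error: a btree index stores the indexed value itself in each index row, so a very long search_value can exceed the index row size limit reported above, while the GIN index on the tsvector column is not the culprit here. Below is a minimal sketch of how the fix might look. The index names are hypothetical (the post does not give them), and option 2 is only an assumption about what one might want if equality lookups on search_value are still needed.

```sql
-- List the indexes on the table to spot the btree index on search_value.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'origo_search_index';

-- Option 1: drop the btree index on the large text column.
-- (Hypothetical index name; substitute the one reported by the query above.)
DROP INDEX origo_search_index_search_value_idx;

-- Option 2 (assumption: exact-match lookups are still required):
-- index a fixed-length hash of the value instead of the value itself.
CREATE INDEX origo_search_index_search_value_md5_idx
    ON origo_search_index (md5(search_value));

-- A query must use the same expression for the planner to consider that index.
SELECT id
FROM origo_search_index
WHERE md5(search_value) = md5('some exact text');
```

Full-text queries keep going through the GIN index on textsearchable_index_col either way.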

--
Andreas Joseph Krogh<andreak@xxxxxxxxxxxx>
Senior Software Developer / CTO
------------------------+---------------------------------------------+
OfficeNet AS            | The most difficult thing in the world is to |
Rosenholmveien 25       | know how to do a thing and to watch         |
1414 Trollåsen          | somebody else doing it wrong, without       |
NORWAY                  | comment.                                    |
                        |                                             |
Tlf:    +47 24 15 38 90 |                                             |
Fax:    +47 24 15 38 91 |                                             |
Mobile: +47 909  56 963 |                                             |
------------------------+---------------------------------------------+


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

