Re: Values larger than 1/3 of a buffer page cannot be indexed (hstore)

Stefan Keller <sfkeller@xxxxxxxxx> · Sun, 22 May 2011 15:21:45 +0200

Hi Tom, hi all

Thanks, Tom, for your tipps. You answered 2011/5/1:
> (...), and there's no point in having the
> index column contents be the entire tags value (which is what's leading
> to the failure).  Consider
>
> create index planet_osm_point_amenity on planet_osm_point ((tags->amenity));

To get a more general purpose index I tried also:

CREATE INDEX planet_osm_point_tags ON planet_osm_point USING gist(tags);
-- ERROR: invalid hstore value found
-- SQL state: XX000

And I'm really interested in being able to use GIST. GIST is also
recommended here:
http://www.bostongis.com/PrinterFriendly.aspx?content_name=loading_osm_postgis

But I still get error 'invalid hstore value found' on my machine -
whatever I do!

I tried hard to find out the reason and also to remedy the cause of
this failure.
For example I did an update like this: UPDATE planet_osm_point SET
tags = hstore(hstore_to_array(tags));
...with no success.

I'm running "PostgreSQL 9.1alpha1, compiled by Visual C++ build 1500,
32-bit" on Windows XP SP3.
I have a dump of the table/database at hand to anyone who is
interested in this possible bug.

Yours, Stefan

2011/5/1 Tom Lane <tgl@xxxxxxxxxxxxx>:
> Stefan Keller <sfkeller@xxxxxxxxx> writes:
>> I'm doing an equality search with success with the '->' operator on
>> the same field 'tags' like in this query (1):
>
>> -- Count all restaurants in database ("amenity = restaurant''):
>> select count(*) from osm_all_v
>> where hstore(tags)->'amenity'='restaurant'
>
>> This query 1 is reasonably fast and could be accelerated using this
>> functional index:
>> CREATE INDEX planet_osm_point_tags_restaurant
>> Â ON planet_osm_point
>> Â USING btree (tags)
>> Â WHERE (tags -> 'amenity'::text) = 'restaurant'::text;
>
> This index seems a bit carelessly defined. ÂThere's no need to confine
> its usefulness to exactly that query, and there's no point in having the
> index column contents be the entire tags value (which is what's leading
> to the failure). ÂConsider
>
> create index planet_osm_point_amenity on planet_osm_point ((tags->amenity));
>
> which will work for the above query and any other that's looking for a
> specific value of tags->amenity.
>
>> SELECT MIN(keys.key), hstore(p.tags)->keys.key, count(*)
>> FROM planet_osm_point p, (
>> Â SELECT key, count(*)
>> Â FROM (
>> Â Â SELECT (each(tags)).key FROM planet_osm_point
>> Â ) AS stat
>> GROUP BY key HAVING count(*) > 1
>> Â AND key NOT LIKE 'note:%'
>> Â ...
>> Â AND key NOT IN
>> ('ele','ref','url','website','email','maxspeed','converted_by', ... )
>> ) keys
>> WHERE hstore(p.tags)->keys.key >''
>> GROUP BY hstore(p.tags)->keys.key HAVING count(*) > 1
>> ORDER by 1,3 desc
>
> It's pretty much useless to think about indexes for queries like this.
> If it's going to scan the whole table anyway, as this surely is, then
> an index is not going to make it faster.
>
> Â Â Â Â Â Â Â Â Â Â Â Âregards, tom lane
>

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general