On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller <sfkeller@xxxxxxxxx> wrote: > Merlin and Jeff, > > General remark again:It's hard for me to imagine that btree is > superior for all the issues mentioned before. I still believe in hash > index for primary keys and certain unique constraints where you need > equality search and don't need ordering or range search. It is -- but please understand I'm talking about int32 tree vs hash. Hashing as a technique of course is absolutely going to cream btree for all kinds of data because of the advantages of working with decomposed data -- we are all taught that in comp-sci 101 :-). The debate here is not about the advantages of hashing, but the specific implementation of the hash index used. Postgres's hash index implementation used to be pretty horrible -- it stored the pre-hashed datum in the index which, while making it easier to do certain things, made it horribly slow, and, for all intents and purposes, useless. Somewhat recently,a lot of work was put in to fix that -- the index now packs the hash code only which made it competitive with btree and superior for larger keys. However, certain technical limitations like lack of WAL logging and uniqueness hold hash indexing back from being used like it really should be. In cases where I really *do* need hash indexing, I do it in userland. create table foo ( a_long_field text; ); create index on foo(hash(a_long_field)); select * from foo where hash(a_long_field) = hash(some_value) and a_long_field = some_value; This technique works fine -- the main disadvantage is that enforcing uniqueness is a PITA but since the standard index doesn't support it either it's no great loss. I also have the option of getting 'uniqueness' and being able to skip the equality operation if I sacrifice some performance and choose a strong digest. Until the hash index issues are worked out, I submit that this remains the go-to method to do this. Now, my point here is that I've noticed that even with the latest optimizations btree seems to still be superior to the hash indexing by most metrics, so that: create table foo ( an_int_field int; a_long_field text; ); create index on foo(an_int_field); create index on foo using hash(a_long_field); On performance grounds alone, the btree index seems to be (from my very limited testing) a better bet. So I'm conjecturing that the current hash implementation should be replaced with a formalization of the userland technique shown above -- when you request a hash index, the database will silently hash your field and weave it into a btree. It's a hybrid: a hashed btree. To truly demonstrate if the technique was effective though, it would have to be coded up -- it's only fair to compare if the btree based hash is also double checking the value in the heap which the standard hash index must do. The other way to go of course is to try and fix up the existing hash index code -- add wal logging, etc. In theory, a customized hash structure should be able to beat btree all day long which argues to continue in this direction. @ jeff: >The way you created the table, I think the rows are basically going to be in order in the table, which means the btree index accesses are going to visit the same block over and over again before going to the next block. This does not explain the behavior. Yeah -- it may take longer but your computer should not be sitting idle during create index operations :-). Unfortunately, I was not able to reproduce it on linux. I have to bite the bullet and get the mingw up if I want to try and diagnose -- perhaps it is stalling in the semop calls. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance