Олег Самойлов wrote:
> \set table_size 1000000
> begin;
> create table gender (gender varchar);
>
> insert into gender (gender)
>   select case when random < 0.50 then 'female'
>               when random < 0.99 then 'male'
>               else 'other' end
>   from (select random() as random, generate_series(1, :table_size)) as subselect;
>
> create index gender_btree on gender using btree (gender);
> create index gender_hash on gender using hash (gender);
> commit;
> vacuum full analyze;
>
> The vacuum full is not necessary here, just a little voodoo programming. I expected the hash index to be much smaller and quicker than the btree index, because it keeps only hashes, not the values themselves. But:
>
> => \d+
>                     List of relations
>  Schema |  Name  | Type  | Owner | Size  | Description
> --------+--------+-------+-------+-------+-------------
>  public | gender | table | olleg | 35 MB |
> (1 row)
>
> => \di+
>                           List of relations
>  Schema |     Name     | Type  | Owner | Table  | Size  | Description
> --------+--------------+-------+-------+--------+-------+-------------
>  public | gender_btree | index | olleg | gender | 21 MB |
>  public | gender_hash  | index | olleg | gender | 47 MB |
> (2 rows)
>
> The hash index is not only larger than the btree index, but also larger than the table itself. What is wrong with the hash index?

I guess the problem here is that there are so few distinct values that all the index entries end up in only three hash buckets, forming long linked lists. I can't tell off-hand why that would make the index so large, though. Anyway, indexes are pretty useless in such a case.

Is the behavior the same if you have many distinct values?

Yours,
Laurenz Albe

-- 
Cybertec | https://www.cybertec-postgresql.com
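[Editor's note: the bucket-skew explanation above can be illustrated with a short Python sketch. The bucket count and Python's built-in `hash()` are stand-ins for illustration only, not PostgreSQL's actual hash function or bucket layout; the point is that one entry is stored per row, so a million entries collapse into at most three buckets.]

```python
import random
from collections import Counter

NUM_ROWS = 1_000_000
NUM_BUCKETS = 1024  # hypothetical bucket count, for illustration only

# Generate a gender column with the same skew as the original test case:
# ~50% 'female', ~49% 'male', ~1% 'other'.
rows = [
    "female" if r < 0.50 else ("male" if r < 0.99 else "other")
    for r in (random.random() for _ in range(NUM_ROWS))
]

# A hash index stores one entry per heap row, placed by hash(value).
# With only three distinct values, at most three buckets receive entries,
# and each of those buckets must chain ~hundreds of thousands of entries.
buckets = Counter(hash(v) % NUM_BUCKETS for v in rows)

print("buckets used:", len(buckets))          # at most 3 of 1024
print("largest bucket:", max(buckets.values()))
```

Under these assumptions, roughly 99% of all entries pile into just two buckets, which matches the "large linked lists" (overflow-page chains) described above.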