Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

Jeff Janes <jeff.janes@xxxxxxxxx> · Mon, 19 Sep 2011 09:15:45 -0700

On Mon, Sep 19, 2011 at 8:19 AM, Robert Klemme
<shortcutter@xxxxxxxxxxxxxx> wrote:
> On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>
>> The other way to go of course is to try and fix up the existing hash
>> index code -- add wal logging, etc. In theory, a customized hash
>> structure should be able to beat btree all day long which argues to
>> continue in this direction.
>
> I still haven't seen a solution to locking when a hash table needs
> resizing.  All hashing algorithms I can think of at the moment would
> require a lock on the whole beast during the resize which makes this
> type of index impractical for certain loads (heavy updating).

The current implementation doesn't EX lock the whole beast during
resizing, except a brief one at the beginning (and maybe at the end?)
of the split.  It does EX lock the entire bucket being split for the
duration of the split, though.  The main problem that I see is that
due to the potential for deadlocks, the locks involved have to be
heavy-weight.  Which means the shared locks which non-splitting
processes have to use to block against those EX locks have to be
heavy-weight too, and getting even shared heavy-weight locks means
exclusive light-weight locks, which means contention.

One way out would be to have a special process (probably vacuum) do
all the resizing/splitting, rather than having regular backends doing
it.  It should be possible to make this dead-lock free.

Another would be to make hash indexes only support bit-map scans.
Then the scanning hash-code would never have a need to pass control
back to the executor while still holding a bucket lock.

Another would be to make the current position of the hash index scan
be "refindable" if a split should occur during the scan, so that you
don't need to inhibit splits during a scan of the same bucket.  This
would probably be easy if there were no overflow pages.   But the
overflow pages get shuffled in with each other and with the main
bucket page during a split.  It would take quite some gymnastics to
get around that.

Cheers,

Jeff

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance