Re: Questions about btree_gin vs btree_gist for low cardinality columns

Tom Lane <tgl@xxxxxxxxxxxxx> · Sun, 02 Jun 2019 19:07:14 -0400

Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> On Fri, May 24, 2019 at 11:26 AM Jeremy Finzel <finzelj@xxxxxxxxx> wrote:
>> I have been hoping for clearer direction from the community about
>> specifically btree_gin indexes for low cardinality columns (as well as low
>> cardinality multi-column indexes).  In general there is very little
>> discussion about this both online and in the docs.  Rather, the emphasis
>> for GIN indexes discussed is always on full text search of JSON indexing,
>> not btree_gin indexes.

I just wanted to mention that Jeremy and I had a bit of hallway-track
discussion about this at PGCon.  The core thing to note is that the GIN
index type was designed to deal with data types that are subdividable
and you want to search for individual component values (array elements,
lexemes in a text document, etc).  The btree_gin extension abuses this
by just storing the whole values as if they were components.  AFAIR,
the original idea for writing both btree_gin and btree_gist was to allow
creating a single multicolumn index that covers both subdividable and
non-subdividable columns.  The idea that btree_gin might be used on its
own wasn't really on the radar, I don't think.

However, now that GIN can compress multiple index entries for the same
component value (which has only been true since 9.4, whereas btree_gin
is very much older than that) it seems like it does make sense to use
btree_gin on its own for low-cardinality non-subdividable columns.
And that means that we ought to consider non-subdividable columns as
fully legitimate, not just a weird corner usage.  So in particular
I wonder whether it would be worth adding the scaffolding necessary
to support index-only scan when the GIN opclass is one that doesn't
subdivide the data values.

That leaves me quibbling with some points in Jeff's otherwise excellent
reply:

> For single column using a btree_gin operator, each index entry holds the
> entire data value.  But the system doesn't know about that in any useful
> way, so it doesn't implement index only scans, other than in the special
> case where the value does not matter, like a 'count(*)'.  Conceptually
> perhaps this could be fixed, but I don't see it happening. Since an
> index-only scan is usually not much good with only a single-column index, I
> don't see much excitement to improve things here.

I'm confused by this; surely IOS is useful even with a single-column
index?  Avoiding trips to the heap is always helpful.

> If if there is more than
> one column in the index, then for GIN the entire value from both columns
> would not be stored in the same index entry, so in this case it can't use
> an index-only scan even conceptually to efficiently fetch the value of one
> column based on the supplied value of another one.

Yeah, that is a nasty issue.  You could do it, I think, by treating the
additional column as being queried even though there is no WHERE
constraint on it --- but that would lead to scanning all the index entries
which would be very slow.  Maybe still faster than a seqscan though?
But I suspect we'd end up wanting to extend the notion of "IOS" to say
that GIN can only return columns that have an indexable constraint,
which is a little weird.  At the very least we'd have to teach
gincostestimate about this, or we could make very bad plan choices.

Anyway, I said to Jeremy in the hallway that it might not be that
hard to bolt IOS support onto GIN for cases where the opclass is
a non-subdividing one, but after looking at the code I'm less sure
about that.  GIN hasn't even got an "amgettuple" code path, just
"amgetbitmap", and a big part of the reason why is the need to merge
results from the fastupdate pending list with results from the main
index area.  Not sure how we could deal with that.

> GIN indexes over btree_gin operators do not support inequality or BETWEEN
> queries efficiently.

Are you sure about that?  It's set up to use the "partial match" logic,
which is certainly pretty weird, but it does have the potential for
handling inequalities efficiently.  [ pokes at it ... ]  Hm, looks like
it does handle individual inequality conditions reasonably well, but
it's not smart about the combination of a greater-than and a less-than
constraint on the same column.  It ends up scanning all the entries
satisfying the one condition, plus all the entries satisfying the other,
then intersecting those sets --- which of course comprise the whole table
:-(.  I think maybe this could be made better without a huge amount of
work though.

> ... I do recall that
> the replay of GIN WAL  onto a standby was annoying slow, but I haven't
> tested it with your particular set up and not in the most recent versions.

A quick look at the commit log says that some work was done in this area
for 9.4, and more in v12, but I've not tried to measure the results.

Anyway, the larger point here is that right now btree_gin is just a quick
hack, and it seems like it might be worth putting some more effort into
it, because the addition of duplicate-compression changes the calculus
for whether it's useful.

			regards, tom lane