Hibernate generated query slow compared to 'equivalent' hand written one

Kyle Moser <moser.kyle@xxxxxxxxx> · Fri, 14 Oct 2016 13:27:04 -0400

The depesz link for explain (analyze, buffers) is shown below for 3 different queries. The first two queries show a log dump of the postgres log, showing a query that was generated by Java Hibernate. The third query was one I wrote and ran in pgadmin that I think is similar to what Hibernate is doing. You can see in the first link query plan output that the query shown below is doing a sequential scan on REMINDERS which I do not understand since REMINDERS has an index on FK_USER. That isn't the focus of my question. In the second query I forced it to use the index but it wasn't any faster. Now, if I go into pgadmin and execute what I thought was an equivalent query it is much faster (less than 2ms vs. 2000 ms). My question is basically: why is the query I wrote so much faster than the hibernate ones, when the query appears essentially the same?

Schema for REMINDERS table (SQL shown):

CREATE TABLE pgsch.reminders
(
    id bigint NOT NULL,
    fk_user bigint NOT NULL,
    date timestamp without time zone NOT NULL,
    finished character(1) COLLATE "default".pg_catalog NOT NULL DEFAULT 'F'::bpchar,
    subject character varying(255) COLLATE "default".pg_catalog NOT NULL,
    is_on character(1) COLLATE "default".pg_catalog NOT NULL DEFAULT 'T'::bpchar,
    body text COLLATE "default".pg_catalog,
    CONSTRAINT reminders_pkey PRIMARY KEY (id),
    CONSTRAINT usr_cstr FOREIGN KEY (fk_user)
        REFERENCES pgsch.user (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

CREATE INDEX remidx
    ON pgsch.reminders USING btree
    (fk_user)
    TABLESPACE pg_default;

__________________________________________________________________

Additional info:
PostgreSQL 9.5.3
reminders table has 66K rows
Settings used in postgres conf: 
auto_explain.log_analyze = true, auto_explain.log_buffers = true, track_io_timing = true, work_mem = 256MB, shared_buffers = 8GB, effective_cache_size = 25GB

Additional info from the postgres log for the first query:

2016-10-14 11:23:38.107 EDT >LOG:  duration: 0.798 ms  parse <unnamed>: select distinct this_.FK_USER as y0_ from pgsch.REMINDERS this_ where this_.FK_USER in ($1, ..., $999)
2016-10-14 11:23:38.109 EDT >LOG:  duration: 1.164 ms  bind <unnamed>: select distinct this_.FK_USER as y0_ from pgsch.REMINDERS this_ where this_.FK_USER in ($1, ..., $999)

 >DETAIL:  parameters: $1 = '213', $2 = '382', $3 = '131', $4 = '174', $5 = '885', ..., $992 = '830', $993 = '333', $994 = '414', $995 = '481', $996 = '454', $997 = '728', $998 = '281', $999 = '717'
  duration: 1571.404 ms  execute <unnamed>: select distinct this_.FK_USER as y0_ from pgsch.REMINDERS this_ where this_.FK_USER in ($1, ..., $999)
  parameters: $1 = '213', $2 = '382', $3 = '131', $4 = '174', $5 = '885', ..., $992 = '830', $993 = '333', $994 = '414', $995 = '481', $996 = '454', $997 = '728', $998 = '281', $999 = '717'

 2016-10-14 11:23:39.682 EDT >LOG:  duration: 1571.388 ms  plan:
        Query Text: select distinct this_.FK_USER as y0_ from pgsch.REMINDERS this_ where this_.FK_USER in ($1, ..., $999)
		https://explain.depesz.com/s/nxMH

Then I put an additional setting in my postgres.conf (enable_seqscan = off) and got this for the second query:
https://explain.depesz.com/s/1UE0

 So as you can see it didn't really improve anything even though it was using the index.

 However, if I run the following query in pgadmin: "explain(analyze, buffers) select distinct this_.FK_USER as y0_ from pgsch.REMINDERS this_ where this_.FK_USER in (1...1000);" 
 I get the following output. Note that 1...1000 here are 1000 values I generated. I ran this a bunch of times and the randomness or values themselves don't seem to make much difference (always faster than 2ms). I also ran this query as a prepared statement with no noticeable difference, since I figured that's what Hibernate is doing.
 https://explain.depesz.com/s/EEb

Why does the third query (pgadmin one) show 270 for the rows whereas the query from the postgres log (initiated by hibernate) shows 65597 rows, which is close to the real number of rows? 
Any other tips why the hibernate query is so slow compared to the pgadmin one?  I am totally stumped here and out of ideas.

Thanks for any help or suggestions anyone can provide.