On Tue, Apr 4, 2017 at 8:20 AM, Daniel Westermann <daniel.westermann@xxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> PostgreSQL 9.6.2 on CentOS 7.3 x64.
>
> This is my data set:
>
> drop table if exists users;
> drop table if exists ids;
> create table users ( user_id int
>                    , username varchar(50)
>                    );
> with generator as
> ( select a.*
>     from generate_series (1,3000000) a
>    order by random()
> )
> insert into users ( user_id
>                   , username
>                   )
> select a
>      , md5(a::varchar)
>   from generator;
> create unique index i_users on users ( user_id );
> create table ids ( id int );
> insert into ids (id) values ( generate_series ( 2500000, 3500000 ) );
> create unique index i_ids on ids ( id );
> analyze users;
> analyze ids;
>
> I have set work_mem to a very low value intentionally for demonstration
> purposes:
>
> postgres=# show work_mem;
>  work_mem
> ----------
>  16MB
> (1 row)
>
> postgres=# show shared_buffers ;
>  shared_buffers
> ----------------
>  128MB
> (1 row)
>
>
> When I run the following query ( I know that "not in" is not a good choice
> here ):
>
> postgres=# select count(user_id) from users where user_id not in ( select id
> from ids);

"NOT IN" where the predicate is a table column can lead to very poor query
plans, especially when the haystack is not provably known (at plan time) to
contain only non-null values. Because you reduced work_mem, the server has
decided it has to repeatedly scan the table to check for the presence of
null values. Try converting the query to NOT EXISTS.

merlin
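
For reference, a minimal sketch of the NOT EXISTS form of the query above
(the "u" and "i" aliases are introduced here purely for illustration):

-- equivalent anti-join formulation of the original NOT IN query
select count(u.user_id)
  from users u
 where not exists ( select 1
                      from ids i
                     where i.id = u.user_id );

With this data set the result is the same, since ids.id contains no nulls,
but the two forms are not equivalent in general: NOT IN yields no rows at
all if the subquery produces a null, while NOT EXISTS simply treats each
non-match as true. The planner can also execute NOT EXISTS as an anti join,
which, unlike the hashed subplan used for NOT IN, does not require the
subquery result to fit in work_mem.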