Re: Any disadvantages of using =ANY(ARRAY()) instead of IN?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I've encoutered similar issues myself (with UNION so far), so I tried
to build a simple test case, which may or may not cover Clemens's
case.

Test case 1 and 2 illustrates the issue, and case 3-9 are variations.

My observation: Looks like the optimizer cannot be close friends with
both UNION and IN/JOIN at the same time.
Actually - it looks like the UNION SELECT kids don't wanna share the
IN/JOIN toy we gave them, but are happy when they get their own toys
to play with ;)


DROP TABLE IF EXISTS table1;
CREATE TABLE table1 AS  SELECT i AS id FROM generate_series(1, 300000)
S(i);
CREATE INDEX ON table1(id);
ANALYZE table1;


-- Test 1: Slow. IN()
SELECT * FROM (
       SELECT * FROM table1
       UNION
       SELECT * FROM table1
) Q WHERE id IN (SELECT id FROM table1 LIMIT 10);


-- Test 2: Fast. ANY(ARRAY())
SELECT * FROM (
       SELECT * FROM table1
       UNION
       SELECT * FROM table1
) Q WHERE id = ANY(ARRAY(SELECT id FROM table1 LIMIT 10));






-- Test 3: Fast. Duplicate IN. Symptom fix? Or would you call it a
"better" query in terms of sql? -except for the unnecessary subquery,
which I kept for readability.
SELECT * FROM (
	SELECT * FROM table1
	WHERE id IN (SELECT id FROM table1 LIMIT 10)
	UNION
	SELECT * FROM table1
	WHERE id IN (SELECT id FROM table1 LIMIT 10)
) Q;


-- Test 4: Fast. Duplicate JOIN CTE.
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT * FROM (
     SELECT * FROM table1 JOIN id_list USING(id)
     UNION
     SELECT * FROM table1 JOIN id_list USING(id)
) Q;


-- Test 5: Slow. IN(CTE)
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT * FROM (
       SELECT * FROM table1
       UNION
       SELECT * FROM table1
) Q WHERE id IN (SELECT * FROM id_list);


-- Test 6: Slow. IN(explicit id list)
SELECT * FROM (
       SELECT * FROM table1
       UNION
       SELECT * FROM table1
) Q WHERE id IN (SELECT
UNNEST('{100001,100002,100003,100004,100005,100006,100007,100008,100009,10010}'::BIGINT[] )
AS id);


-- Test 7: Slow. IN(UNNEST(ARRAY())
SELECT * FROM (
       SELECT * FROM table1
       UNION
       SELECT * FROM table1
) Q WHERE id IN (SELECT UNNEST(ARRAY(SELECT id FROM table1 LIMIT 10))
AS id);


-- Test 8: Slow. JOIN CTE
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT * FROM (
      SELECT * FROM table1
      UNION
      SELECT * FROM table1
) Q JOIN id_list USING(id);


-- Test 9: Fast. JOIN CTE + UNION ALL/DISTINCT (not quite the same
query)
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT DISTINCT * FROM (
      SELECT * FROM table1
      UNION ALL
      SELECT * FROM table1
) Q JOIN id_list USING(id);


--
Geir Bostad
9.1.3(x64,win)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux