Anti join miscalculates row number?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I?ve got a lengthy query, that doesn't finish in reasonable time (i.e.
10min+). I suspect, that the query optimizer miscalculates the number of
rows for part of the query.

The suspicious subquery:

SELECT
				sv1.sid as sid
			FROM stud_vera sv1 
			LEFT JOIN stud_vera AS sv2 
			ON (
				sv1.sid=sv2.sid 
				AND sv2.veraid IN ( 109 )
				) 
			WHERE sv1.veraid IN ( 3 ) 
			AND sv2.veraid IS NULL

The whole query:

SELECT count(DISTINCT sid) AS Anzahl FROM (SELECT sid
                        FROM stud
                        WHERE (
                                        status IN (1,2)
                                        AND length(vname) > 1
                                        AND length(nname) > 1
					AND length(email) > 1
                                        
                                )) AS stud INNER JOIN (SELECT DISTINCT
                                sid,
                                cast(created AS date) AS tag,
                                cast(veradate AS DATE) -
cast(stud_vera.created AS DATE) AS tage,
                                cast(
                                        floor(
                                                (cast(veradate AS date) -
cast(stud_vera.created AS date))/7
                                        )
                                AS integer) AS woche,
                                cast(extract(week from stud_vera.created) AS
integer) AS kalenderwoche,
                                to_char(stud_vera.created, 'YYYY/MM') AS
monat,
                                to_char(stud_vera.abgemeldet, 'YYYY/MM') AS
abmeldemonat,
                                CASE
                                        WHEN newsletterid &1 = 1 THEN 'Flag
1'
                                        WHEN newsletterid &2 = 2 THEN 'Flag
2'
                                        WHEN newsletterid &4 = 2 THEN 'Flag
3'
                                        WHEN newsletterid &8 = 8 THEN 'Flag
4'
                                        WHEN newsletterid &16 = 16 THEN
'Flag 5'
                                        WHEN newsletterid &32 = 32 THEN
'Flag 6'
                                        WHEN newsletterid &64 = 64 THEN
'Flag 7'
                                        WHEN newsletterid &128 = 128 THEN
'Flag 8'
                                END AS newsletterid
                        FROM stud_vera
                        JOIN vera USING (veraid)
                        WHERE


                        stud_vera.status > 0
                        AND abgemeldet is null




                        AND veraid IN (

                                        3

                        )) AS vera USING (sid)  INNER JOIN (SELECT
                        sid,
                        age(date_trunc('MONTH', now()), date_trunc('MONTH',
bis)) || '' AS months
                FROM
                        study
                WHERE status = 1

                                AND

                                        age(date_trunc('MONTH', now()),
date_trunc('MONTH', bis)) < interval '60 months'


                                        AND


                                        age(date_trunc('MONTH', bis),
date_trunc('MONTH', now())) <= interval '-24 months') AS examen USING (sid)
INNER JOIN (SELECT
                                sv1.sid as sid
                        FROM stud_vera sv1
                        LEFT JOIN stud_vera AS sv2
                        ON (
                                sv1.sid=sv2.sid
                                AND sv2.veraid IN ( 109 )
                                )
                        WHERE sv1.veraid IN ( 3 )
                        AND sv2.veraid IS NULL) AS veraAusschluss USING
(sid)


And the explain analyze for the sub query: http://explain.depesz.com/s/8d2 
And the explain for the whole query: http://explain.depesz.com/s/GGf
(explain analyze doesn't finish in reasonable time)

What strucks me, is that in the sub query row numbers for sv1 and sv2 are
calculated quite accurat. But the resulting 4 rows after the final join is
far from reality. Shouldn't this be as minimum the number of rows for sv1
minus number of rows for sv2?

If the optimizer knew, that the number is much bigger, it probably wouldn't
choose the nested loop in the next step, which I suspect is the reason for
the performance issues.

We're using postgres 9.0.4. It might be interesting, that the same query
runs smoothly on our test system with postgres 8.3.7.

The tables:

                                           Tabelle »public.stud_vera«
      Spalte       |             Typ             |
Attribute
-------------------+-----------------------------+--------------------------
------------------------------------
 svid              | integer                     | not null Vorgabewert
nextval('stud_vera_svid_seq'::regclass)
 sid               | integer                     |
 veraid            | integer                     |
 modified          | timestamp without time zone | not null Vorgabewert
now()
 created           | timestamp without time zone | not null Vorgabewert
now()
 verastep1         | timestamp without time zone |
 kontoinhaber      | character varying(64)       |
 kontonum          | character varying(32)       |
 blz               | character varying(32)       |
 bank              | character varying(64)       |
 betrag            | numeric(5,2)                |
 verastep2         | timestamp without time zone |
 deferred          | smallint                    |
 verastep3         | timestamp without time zone |
 status            | smallint                    | not null
 label             | character varying(128)      |
 kanalid           | integer                     |
 deferredtxt       | character varying(256)      |
 comment           | character varying(64)       |
 label2            | character varying(128)      |
 dstid             | integer                     |
 abgemeldet        | date                        |
 bsid              | integer                     | Vorgabewert 1
 newsletterid      | integer                     |
 abmeldenewsletter | integer                     | not null Vorgabewert 0
 kanalcomment      | character varying(128)      |
Indexe:
    "stud_vera_pkey" PRIMARY KEY, btree (svid)
    "stud_vera_sid_veraid_idx" UNIQUE, btree (sid, veraid)
    "stud_vera_sid_idx" btree (sid)
    "stud_vera_veraid_idx" btree (veraid)
Fremdschlüssel-Constraints:
    "$1" FOREIGN KEY (sid) REFERENCES stud(sid) ON UPDATE CASCADE ON DELETE
CASCADE
    "$2" FOREIGN KEY (veraid) REFERENCES vera(veraid) ON UPDATE CASCADE ON
DELETE SET NULL
    "stud_vera_dstid" FOREIGN KEY (dstid) REFERENCES datenschutztext(dstid)
ON UPDATE CASCADE ON DELETE SET NULL
Fremdschlüsselverweise von:
    TABLE "eingang" CONSTRAINT "eingang_svid_fkey" FOREIGN KEY (svid)
REFERENCES stud_vera(svid) ON UPDATE CASCADE ON DELETE CASCADE

                                        Tabelle »public.stud«
    Spalte     |             Typ             |
Attribute
---------------+-----------------------------+------------------------------
--------------------------
 sid           | integer                     | not null Vorgabewert
nextval('stud_sid_seq'::regclass)
 login         | character varying(64)       | not null
 passwd        | character varying(32)       |
 modified      | timestamp without time zone | not null Vorgabewert now()
 created       | timestamp without time zone | not null Vorgabewert now()
 lastlogin     | timestamp without time zone |
 mow           | smallint                    |
 titel         | character varying(32)       |
 vname         | character varying(32)       |
 nname         | character varying(32)       |
 birth         | date                        |
 einstieg      | date                        |
 blacksheep    | integer                     |
 studstatusid  | integer                     |
 status        | smallint                    | not null
 studmodified  | timestamp without time zone |
 adminmodified | timestamp without time zone |
 comment       | character varying(128)      |
 dstid         | integer                     |
 linkid        | integer                     |
 beesiteuserid | integer                     |
 ypdate        | date                        |
 email         | character varying(64)       |
 flag          | smallint                    |
Indexe:
    "stud_pkey" PRIMARY KEY, btree (sid)
    "stud_login_idx" UNIQUE, btree (login)
    "stud_login_lower" btree (lower(login::text))
    "stud_nname_idx" btree (lower(nname::text))
    "stud_sid_status_idx" btree (sid, status)
    "stud_vname_idx" btree (lower(vname::text))
Check-Constraints:
    "birth" CHECK (birth >= '1900-01-01'::date AND birth <=
'1999-12-31'::date)
Fremdschlüssel-Constraints:
    "stud_dstid_fkey" FOREIGN KEY (dstid) REFERENCES datenschutztext(dstid)
ON UPDATE CASCADE ON DELETE SET NULL
Fremdschlüsselverweise von:
    TABLE "stud_vera" CONSTRAINT "$1" FOREIGN KEY (sid) REFERENCES stud(sid)
ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_wdwolle" CONSTRAINT "$1" FOREIGN KEY (sid) REFERENCES
stud(sid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_staats" CONSTRAINT "$1" FOREIGN KEY (sid) REFERENCES
stud(sid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "locking" CONSTRAINT "$1" FOREIGN KEY (sid) REFERENCES stud(sid)
ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_ad" CONSTRAINT "$1" FOREIGN KEY (sid) REFERENCES stud(sid)
ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "gutschein" CONSTRAINT "$2" FOREIGN KEY (sid) REFERENCES stud(sid)
ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "kontakt" CONSTRAINT "kontakt_sid_fkey" FOREIGN KEY (sid)
REFERENCES stud(sid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_ae" CONSTRAINT "stud_ae_sid_fkey" FOREIGN KEY (sid)
REFERENCES stud(sid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_berufsfeld" CONSTRAINT "stud_berufsfeld_fk_sid" FOREIGN KEY
(sid) REFERENCES stud(sid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_einstiegsbereich" CONSTRAINT
"stud_einstiegsbereich_fkey_sid" FOREIGN KEY (sid) REFERENCES stud(sid) ON
UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_vakanzen" CONSTRAINT "stud_vakanzen_sid_fkey" FOREIGN KEY
(sid) REFERENCES stud(sid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_vposition" CONSTRAINT "stud_vposition_sid_fkey" FOREIGN KEY
(sid) REFERENCES stud(sid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "study" CONSTRAINT "study_sid_fkey" FOREIGN KEY (sid) REFERENCES
stud(sid) ON UPDATE CASCADE ON DELETE CASCADE

                                         Tabelle »public.vera«
    Spalte    |             Typ             |
Attribute
--------------+-----------------------------+-------------------------------
----------------------------
 veraid       | integer                     | not null Vorgabewert
nextval('vera_veraid_seq'::regclass)
 vera         | character varying(64)       |
 verakurz     | character varying(32)       |
 vera_e       | character varying(8)        |
 vera_e2      | character varying(8)        |
 veratyp      | smallint                    |
 veradate     | date                        |
 veradauer    | integer                     |
 veraort      | character varying(32)       |
 veraland     | character varying(32)       |
 veracomment  | character varying(255)      |
 active       | smallint                    |
 status       | smallint                    |
 landid       | integer                     |
 spontandate  | date                        |
 spontandate2 | date                        |
 dstid        | integer                     |
 xmlconf      | character varying(128)      |
 verakurz2    | character varying(32)       |
 closingdate  | timestamp without time zone |
 url          | character varying(128)      |
 urltext      | character varying(32)       | Vorgabewert 'Zum
Event'::character varying
 etflag       | integer                     |
Indexe:
    "vera_pkey" PRIMARY KEY, btree (veraid)
    "vera_verakurz_unique" UNIQUE, btree (verakurz)
Fremdschlüssel-Constraints:
    "vera_dstid" FOREIGN KEY (dstid) REFERENCES datenschutztext(dstid) ON
UPDATE CASCADE ON DELETE SET NULL
Fremdschlüsselverweise von:
    TABLE "vera_reihe" CONSTRAINT "$1" FOREIGN KEY (veraid) REFERENCES
vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "gutschein" CONSTRAINT "$1" FOREIGN KEY (veraid) REFERENCES
vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "prod_vera" CONSTRAINT "$2" FOREIGN KEY (veraid) REFERENCES
vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "stud_vera" CONSTRAINT "$2" FOREIGN KEY (veraid) REFERENCES
vera(veraid) ON UPDATE CASCADE ON DELETE SET NULL
    TABLE "auswahlevent" CONSTRAINT "auswahlevent_veraid_fkey" FOREIGN KEY
(veraid) REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "inside" CONSTRAINT "fk_inside_veraid" FOREIGN KEY (veraid)
REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "fprofil" CONSTRAINT "fprofil_veraid_fkey" FOREIGN KEY (veraid)
REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "mailversand" CONSTRAINT "mailversand_veraid_fkey" FOREIGN KEY
(veraid) REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "mailvorlage" CONSTRAINT "mailvorlage_veraid_fkey" FOREIGN KEY
(veraid) REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "raum" CONSTRAINT "raum_veraid_fkey" FOREIGN KEY (veraid)
REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "schiene" CONSTRAINT "schiene_veraid_fkey" FOREIGN KEY (veraid)
REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "vakanzen" CONSTRAINT "vakanzen_veraid_fkey" FOREIGN KEY (veraid)
REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "vposition" CONSTRAINT "vposition_veraid_fkey" FOREIGN KEY
(veraid) REFERENCES vera(veraid) ON UPDATE CASCADE ON DELETE CASCADE

                                               Tabelle »public.study«
       Spalte        |              Typ               |
Attribute
---------------------+--------------------------------+---------------------
----------------------------------------
 studyid             | integer                        | not null Vorgabewert
nextval('study_studyid_seq'::regclass)
 sid                 | integer                        | not null
 modified            | timestamp(0) without time zone | not null Vorgabewert
now()
 created             | timestamp(0) without time zone | not null Vorgabewert
now()
 abschlusstypid      | integer                        |
 uniid               | integer                        |
 von                 | date                           |
 bis                 | date                           |
 unisonstige         | character varying(128)         |
 unilandid           | integer                        |
 ausrichtungsonstige | character varying(64)          |
 vertiefungsonstige  | character varying(64)          |
 qnoteid             | integer                        |
 status              | smallint                       | not null Vorgabewert
1
Indexe:
    "study_pkey" PRIMARY KEY, btree (studyid)
    "study_sid_idx" btree (sid)
Fremdschlüssel-Constraints:
    "study_sid_fkey" FOREIGN KEY (sid) REFERENCES stud(sid) ON UPDATE
CASCADE ON DELETE CASCADE
Fremdschlüsselverweise von:
    TABLE "study_ausrichtung" CONSTRAINT "study_ausrichtung_studyid_fkey"
FOREIGN KEY (studyid) REFERENCES study(studyid) ON UPDATE CASCADE ON DELETE
CASCADE
    TABLE "study_vertiefung" CONSTRAINT "study_vertiefung_fkey1" FOREIGN KEY
(studyid) REFERENCES study(studyid) ON UPDATE CASCADE ON DELETE CASCADE

Many thanks

-- 
Jens Reufsteck



-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux