BUG: endless lseek(.., SEEK_END) from select queries on x64 builds

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



hello,

i have a strange and reproducible bug with some select queries and 64bit 
postgresql builds (works fine on 32bit builds).
The postgres process will run with 100% cpu-load (no io-wait) and strace 
will show endless lseek(..., SEEK_END) calls on one table for minutes.
lseek(28, 0, SEEK_END)                  = 26697728
lseek(28, 0, SEEK_END)                  = 26697728
lseek(28, 0, SEEK_END)                  = 26697728
...
the file-descriptor 28 points to the file for the webapps_base.Acquisition 
table (see query/plan below).

Now the details:

The Query:
select count(ac.ID) as col_0_0_ from 
	webapps_base.Acquisition ac, 
	webapps_base.SalesPartnerStructure struc
	where 
		struc.fk_SalesPartner_child=ac.fk_SalesPartner_ID 
		and struc.fk_SalesPartner_parent=200
		and (ac.CreationDate between '2012-02-01' and '2013-01-31') 
		and ac.acquisitiondepot='STANDARD' 
		and ('2013-01-31' between struc.ValidFrom 
			and coalesce(struc.ValidTo, '2013-01-31'))

The Plan:
"Aggregate  (cost=32617.11..32617.12 rows=1 width=8) (actual time=204.180..204.180 rows=1 loops=1)"
"  ->  Merge Join  (cost=32232.01..32598.26 rows=7543 width=8) (actual time=172.882..202.218 rows=21111 loops=1)"
"        Merge Cond: (ac.fk_salespartner_id = struc.fk_salespartner_child)"
"        ->  Sort  (cost=5582.60..5635.69 rows=21235 width=16) (actual time=28.920..31.121 rows=21204 loops=1)"
"              Sort Key: ac.fk_salespartner_id"
"              Sort Method: quicksort  Memory: 1763kB"
"              ->  Bitmap Heap Scan on acquisition ac  (cost=395.26..4056.43 rows=21235 width=16) (actual time=3.064..15.868 rows=21223 loops=1)"
"                    Recheck Cond: ((creationdate >= '2012-02-01'::date) AND (creationdate <= '2013-01-31'::date))"
"                    Filter: ((acquisitiondepot)::text = 'STANDARD'::text)"
"                    ->  Bitmap Index Scan on index_acquistion_creationdate  (cost=0.00..389.95 rows=21267 width=0) (actual time=2.890..2.890 rows=21514 loops=1)"
"                          Index Cond: ((creationdate >= '2012-02-01'::date) AND (creationdate <= '2013-01-31'::date))"
"        ->  Sort  (cost=26648.60..26742.61 rows=37606 width=8) (actual time=143.952..152.808 rows=131713 loops=1)"
"              Sort Key: struc.fk_salespartner_child"
"              Sort Method: quicksort  Memory: 8452kB"
"              ->  Bitmap Heap Scan on salespartnerstructure struc  (cost=3976.80..23790.79 rows=37606 width=8) (actual time=13.279..64.681 rows=114772 loops=1)"
"                    Recheck Cond: (fk_salespartner_parent = 200)"
"                    Filter: (('2013-01-31'::date >= validfrom) AND ('2013-01-31'::date <= COALESCE(validto, '2013-01-31'::date)))"
"                    ->  Bitmap Index Scan on index_parent_salespartner  (cost=0.00..3967.39 rows=114514 width=0) (actual time=13.065..13.065 rows=116479 loops=1)"
"                          Index Cond: (fk_salespartner_parent = 200)"
"Total runtime: 205.397 ms"

as you can see the query runs fine. 
I can run this query from a bash-psql-while-loop/jdbc-cli-tool 
endless without any problems. 
so far so good.

But now i run the same query from:

JBoss EAP 5.1.2 with connection pooling and xa-datasource/two-phase-commits 
(transactions on multiple datasources needed)
*and* *<prepared-statement-cache-size>1000</prepared-statement-cache-size>*

i can run the query four times with good performance and after that postgresql 
starts with the strange lseek() behavior. 
The query needs more then a minute to complete and during execution the 
postgres process runs at 100% cpu load with lseek calls (straced).
If i flush the connection pool (close all open connections from the jboss 
jmx-console) it works again for four calls.
These problem applies only to 64bit builds. If i run a 32bit postgresql 
server it works fine.

We have tested the following environments:

- Debian Squeeze 64bit with Postgresql 9.1.[5,6,7] -> Bad behavior
- Debian Wheezy 64bit with Postgresql 9.1.8 64bit -> Bad behavior
- Ubuntu 12.04 LTS 64bit with Postgresql 9.1.8 64bit -> Bad behavior
- Windows 7 x64 with Postgresql 9.1.8 64bit -> Bad behavior
- Debian Wheezy 64bit with EnterpriseDB 9.2 64bit -> Bad behavior

- Debian Wheezy 64bit with Postgresql 9.1.8 32bit -> Good behavior
- Debian Wheezy 32bit with Postgresql 9.1.8 32bit -> Good behavior

as you can see all 64bit builds of postgresql are affected (independent from os-arch).

If i disable the prepared-statement-cache-size (remove it from -ds.xml) 
it works on 64bit build too.

regards,
msc


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux