Thorsten Schöning <tschoening@xxxxxxxxxx> writes:

> Hi all,
>
> I have lots of queries in which I need to restrict access to rows
> using some decimal row-ID, and I mostly do so using the operator IN in
> WHERE clauses. Additionally, I mostly embed the IDs directly into the
> query as a ","-separated list, e.g. because I already hit the
> limitation of ~32k parameters of the JDBC driver[1] for Postgres.
>
> I really thought that in most cases simply sending a large number of
> IDs embedded in the query is better than looping, because it saves
> round trips to the DB, the planner has all the pieces of information
> it needs to decide on the best strategy, etc. OTOH, with recent tests
> and the number of IDs increased by a factor of about 100, I already
> see additional load in Tomcat before the query is even sent to the DB,
> and of course in the DB itself as well. I've attached an example query
> and plan.
>
>> ->  Hash  (cost=242592.66..242592.66 rows=6825 width=39) (actual time=91.117..91.117 rows=40044 loops=3)
>>       Buckets: 65536 (originally 8192)  Batches: 1 (originally 1)  Memory Usage: 3016kB
>>       ->  Hash Join  (cost=137.57..242592.66 rows=6825 width=39) (actual time=10.194..82.412 rows=40044 loops=3)
>>             Hash Cond: (meter.meter_bcd = meter_bcd.id)
>>             ->  Index Scan using pk_meter on meter  (cost=0.42..242237.10 rows=40044 width=25) (actual time=9.350..71.276 rows=40044 loops=3)
>>                   Index Cond: (id = ANY ('{[...]}'::integer[]))
>>             ->  Hash  (cost=135.73..135.73 rows=113 width=22) (actual time=0.830..0.830 rows=113 loops=3)
>
> Do you know of any obvious limitations of the JDBC driver in handling
> such large queries? In the end, the query is mostly a large text with
> only very few bind parameters.
>
> Do you know of any obvious problem in Postgres itself with such a
> query, when parsing it or the like? Do things simply take as long as
> they take and are mostly comparable to looping, or is there some
> additional overhead the larger the query text gets? My naive
> expectation is that comparing IDs shouldn't care whether they arrive
> in a loop or all at once.
>
> I'm just trying to collect some input on where to look to optimize
> things in the future. Thanks!
>
> [1]: https://github.com/pgjdbc/pgjdbc/issues/90
>
> Kind regards,
>
> Thorsten Schöning

It would help to see the query as well as the plan.

Where are these tens of thousands of IDs coming from? I sometimes find
this to be a sign that the query could be restructured as a join
between two tables, where one table contains the IDs of interest,
rather than trying to embed them into the query as part of a WHERE
clause.

--
Tim Cross
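
One way around both the ~32k bind-parameter limit and the cost of
building a huge comma-separated literal is to bind the whole ID list as
a single array parameter and compare with "= ANY (?)". This is only a
sketch: the connection details are placeholders, and the meter/id/
meter_bcd names are simply taken from the plan above.

    import java.sql.Array;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class AnyArrayExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "secret")) {

                Integer[] ids = { 1, 2, 3 };   // ... tens of thousands of IDs

                // The entire list travels as ONE bind parameter (a PostgreSQL
                // integer[]), so the per-statement parameter limit is never
                // hit and the SQL text itself stays small.
                Array idArray = conn.createArrayOf("integer", ids);

                String sql =
                    "SELECT m.id, m.meter_bcd FROM meter m WHERE m.id = ANY (?)";
                try (PreparedStatement ps = conn.prepareStatement(sql)) {
                    ps.setArray(1, idArray);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            // process the row ...
                        }
                    }
                } finally {
                    idArray.free();
                }
            }
        }
    }

Since the attached plan already shows "Index Cond: (id = ANY
('{...}'::integer[]))", the executor side should behave much the same;
the difference is mainly that the driver no longer has to build, and
the server no longer has to parse, a very large SQL text.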
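
On the join idea: if the IDs originate in the database anyway (e.g. in
some access-control table), the cleanest restructuring is a plain join
or subquery against that table, so the IDs never leave the server at
all. If they really originate in the application, a temporary table can
play the role of the "table containing the IDs of interest". A sketch
under those assumptions, again with placeholder names:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class TempTableJoinExample {
        public static void main(String[] args) throws Exception {
            int[] ids = { 1, 2, 3 };   // ... the IDs of interest

            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "secret")) {
                conn.setAutoCommit(false);

                // A session-private table holding the IDs of interest;
                // it disappears automatically at commit.
                try (Statement st = conn.createStatement()) {
                    st.execute("CREATE TEMP TABLE wanted_ids"
                            + " (id integer PRIMARY KEY) ON COMMIT DROP");
                }

                // Batch-insert the IDs: one parameter per row, sent in
                // batches, so the per-statement parameter limit is irrelevant.
                try (PreparedStatement ins = conn.prepareStatement(
                        "INSERT INTO wanted_ids (id) VALUES (?)")) {
                    for (int id : ids) {
                        ins.setInt(1, id);
                        ins.addBatch();
                    }
                    ins.executeBatch();
                }

                // The query itself becomes a small, constant piece of SQL.
                String sql = "SELECT m.id, m.meter_bcd FROM meter m"
                        + " JOIN wanted_ids w ON w.id = m.id";
                try (PreparedStatement ps = conn.prepareStatement(sql);
                     ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // process the row ...
                    }
                }

                conn.commit();
            }
        }
    }

Whether this beats a single big "= ANY (...)" depends on the workload;
it mainly pays off when the same ID set is reused for several queries,
or when the IDs can be produced by a query on the server in the first
place.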