At Thu, 6 Jan 2022 13:50:55 +0530, Vijaykumar Jain <vijaykumarjain.github@xxxxxxxxx> wrote in > On Thu, 6 Jan 2022 at 13:13, Avi Weinberg <AviW@xxxxxxxxx> wrote: > > > Hi > > > > > > > > I have postgres_fdw table called tbl_link. The source table is 2.5 GB in > > size with 122 lines (some lines has 70MB bytea column, but not the ones I > > select in the example) > > > > I noticed that when I put the specific ids in the list "where id in > > (140,144,148)" it works fast (few ms), but when I put the same list as > > select "where id in (select 140 as id union select 144 union select 148)" > > it takes 50 seconds. This select union is just for the example, I > > obviously have a different select (which by itself takes few ms but cause > > the whole insert query to take 10000x more time) > > > > > > > > Why is that? How can I still use regular select and still get reasonable > > response time? > > > > > > > > Thanks > > > > > > > > couple of things: > PostgreSQL: Documentation: 14: F.35. postgres_fdw > <https://www.postgresql.org/docs/current/postgres-fdw.html> > <https://www.postgresql.org/docs/current/postgres-fdw.html>when you set > your foreign server what are your > use_remote_estimate > fetch_size > params for the foreign server. > > you need to know there are certain restrictions on what gets pushed down to > the remote server > i generally use postgres/postgres_fdw.sql at master · postgres/postgres > (github.com) > <https://github.com/postgres/postgres/blob/master/contrib/postgres_fdw/sql/postgres_fdw.sql> > as > a reference > if you predicates are not pushed down, it will bring all the rows from the > foreign server to your local server (and fetch_size value and network io > will add to delay) > and given you used select * , it will be a lot of io, so maybe restrict > only to columns needed after being filtered would help. > > > you can try by running > explain (verbose,analyze) query and then also enabling log_statement = > 'all' / log_min_duration_statement = 0 > on the foreign server to see the actual plan for the foreign scan. > > That might help in trouble shooting. > > > as always, i have little production exposure. If i am wrong, i can be > corrected. In this specific case, the FAST query doesn't contain a join and its predicate can be pushed down to remote. On the other hand the SLOW one contains a join. The planner considers remote join only when the both hands of a join are on the same foreign server. Tthis is not the case since the inner subquery is not even a foreign scan. The planner doesn't consider the possibility that a subquery is executable anywhere. As the result, the local inevitably draw all rows from remote table to join with the result of the subquery on-local, which should be quite slow. It could be improved, but I don't think we are going to consider that case because the SLOW query seems like a kind of bad query, which can be improved by rewriting to the FAST one. regards. -- Kyotaro Horiguchi NTT Open Source Software Center