On 6 May 2013 02:51, Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx> wrote:
> On 05/05/13 00:49, Simon Riggs wrote:
>>
>> On 3 May 2013 13:41, Simon Riggs <simon@xxxxxxxxxxxxxxx> wrote:
>>
>>> (3) to make the check on TransactionIdIsInProgress() into a heuristic,
>>> since we don't *need* to check that, so if we keep checking the same
>>> xid repeatedly we can reduce the number of checks or avoid xids that
>>> seem to be long running. That's slightly more coding than my quick
>>> hack here but seems worth it.
>>>
>>> I think we need both (1) and (3) but the attached patch does just (1).
>>>
>>> This is a similar optimisation to the one I introduced for
>>> TransactionIdIsKnownCompleted(), except this applies to repeated
>>> checking of as yet-incomplete xids, and to bulk concurrent
>>> transactions.
>>
>> ISTM we can improve performance of TransactionIdIsInProgress() by
>> caching the procno of our last xid.
>>
>> Mark, could you retest with both these patches? Thanks.
>
> Thanks Simon, will do and report back.

OK, here's an easily reproducible test...

Prep:

DROP TABLE IF EXISTS plan;

CREATE TABLE plan
(
    id  INTEGER NOT NULL,
    typ INTEGER NOT NULL,
    dat TIMESTAMP,
    val TEXT NOT NULL
);

insert into plan
select generate_series(1,100000), 0, current_timestamp, 'some texts';

CREATE UNIQUE INDEX plan_id ON plan(id);
CREATE INDEX plan_dat ON plan(dat);

testcase.pgb:

select count(*) from plan where dat is null and typ = 3;

Session 1:

pgbench -n -f testcase.pgb -t 100

Session 2:

BEGIN;
insert into plan
select 1000000 + generate_series(1, 100000), 3, NULL, 'b';

Transaction rate in Session 1 (in tps):

(a) before we run Session 2:
    Current: 5600 tps
    Patched: 5600 tps

(b) after Session 2 has run, yet before transaction end:
    Current: 56 tps
    Patched: 65 tps

(c) after Session 2 has aborted:
    Current/Patched: 836, 1028, 5400 tps
    VACUUM improves timing again

New version of patch attached which fixes a few bugs.

Patch works and improves things, but we're still swamped by the block
accesses via the index.
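(For anyone reproducing this: the plan choice and the block accesses
are easy to inspect with EXPLAIN. A minimal check, assuming the table
and index from the prep script above:)

-- run while Session 2's transaction is still open; the BUFFERS
-- counters show the block accesses the indexscan is incurring
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM plan WHERE dat IS NULL AND typ = 3;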
Which brings me back to Mark's original point, which is that we are
100x slower in this case, and it *is* because the choice of IndexScan
is a bad one for this situation.

After some thought on this, I do think we need to do something about
it directly, rather than by tuning infrastructure (as I just
attempted).

The root cause here is that IndexScan plans are sensitive to mistakes
in data distribution, much more so than other plan types.

The two options, broadly, are to either

1. avoid IndexScans in the planner unless they have a *significantly*
better cost. At the moment we use IndexScans if their cost is lowest,
even if that is only by a whisker.

2. make IndexScans adaptive so that they switch to other plan types
mid-way through execution.

(2) seems fairly hard generically, since we'd have to keep track of
the tids returned from the IndexScan to allow us to switch to a
different plan and avoid re-issuing rows that we've already returned.
But maybe if we adapted the IndexScan plan type so that it adopted a
more page-oriented approach internally, it could act like a
bitmapscan. Anyway, that would need some proof that it would work,
and it sounds like a fair task.
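(As a rough point of comparison, the bitmapscan behaviour can already
be observed on this query using only existing settings, by steering
the planner away from a plain indexscan for one session:)

-- disable plain indexscans for this session only; with
-- enable_bitmapscan still on, the planner will typically fall
-- back to a bitmap heap scan on plan_dat for this query
SET enable_indexscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM plan WHERE dat IS NULL AND typ = 3;

RESET enable_indexscan;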
(1) sounds more easily possible and plausible. At the moment all we
have is enable_indexscan = off. If we had something like
plan_cost_weight_indexscan = N, we could selectively increase the
cost of index scans so that they would be less likely to be selected;
i.e. plan_cost_weight_indexscan = 2 would mean an indexscan would
need to be half the cost of any other plan before it was selected.
(The parameter name is chosen so the same scheme could apply to all
plan types.) The reason to apply this weighting would be to calculate
"risk adjusted cost", not just estimated cost.
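To make that concrete, a sketch of how the knob might be used; note
that plan_cost_weight_indexscan is only the proposal above, not an
existing parameter:

-- HYPOTHETICAL (proposed, not implemented):
--   SET plan_cost_weight_indexscan = 2;
-- would multiply indexscan path costs by 2 before plan comparison,
-- so an indexscan must be estimated at half the cost of the
-- next-best plan before the planner picks it

-- the existing knob, by contrast, is all-or-nothing:
SET enable_indexscan = off;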
--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment: cache_TransactionIdInProgress.v2.patch
Description: Binary data