Re: NOTIFY performance

Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> I wonder if we should be trying to drop duplicates at all.  I think that
> doing that made a lot more sense before payloads existed.

Perhaps, but we have a lot of history to be backwards-compatible with.

> The docs said that the system "can" drop duplicates, so making it no
> longer do so would be backwards compatible.

Maybe compatible from a language-lawyerly point of view, but the
performance characteristics would be hugely different - and since this
complaint is entirely about performance, I don't think it's fair to
ignore that.  We'd be screwing people who've depended on the historical
behavior to accommodate people who expect something that never worked
well before to start working well.

The case that I'm specifically worried about is rules and triggers that
issue NOTIFY without worrying about generating lots of duplicates when
many rows are updated in one command.

> Maybe drop duplicates where the payload was the empty string, but keep
> them otherwise?

Maybe, but that seems pretty weird/unpredictable.  (In particular, if
you have a mixed workload with some of both types of notify, you lose
twice: some of the inserts will need to scan the list, so that cost
is still quadratic, and you still have a huge event list to dump into
the queue when the time comes.)

I seem to recall that we discussed the idea of checking only the last N
notifies for duplicates, for some reasonably small N (somewhere between
10 and 100 perhaps).  That would prevent the quadratic behavior and yet
also eliminate dups in most of the situations where it would matter.
Any N>1 would require a more complicated data structure than is there
now, but it doesn't seem that hard.
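
To make the "last N" idea concrete, here is a rough sketch in standalone
C.  It is not the backend's code: PendingNotifies, add_notify, RECENT_N
and MAX_STR are names and sizes made up for illustration, and the
fixed-size string buffers are a simplification.  It only shows that each
insert does at most N comparisons, however long the pending list gets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define RECENT_N 16   /* hypothetical window size, "between 10 and 100" */
#define MAX_STR  64   /* hypothetical cap; longer strings are truncated */

typedef struct
{
    char channel[MAX_STR];
    char payload[MAX_STR];
} Notification;

typedef struct
{
    Notification recent[RECENT_N]; /* ring buffer of the last N kept events */
    size_t       count;            /* valid entries in recent[], <= RECENT_N */
    size_t       next;             /* slot that the next kept event overwrites */
    size_t       queued;           /* total events kept so far */
} PendingNotifies;

/* Scan only the window: at most RECENT_N comparisons per call. */
static bool
recent_contains(const PendingNotifies *p, const char *channel,
                const char *payload)
{
    for (size_t i = 0; i < p->count; i++)
    {
        if (strcmp(p->recent[i].channel, channel) == 0 &&
            strcmp(p->recent[i].payload, payload) == 0)
            return true;
    }
    return false;
}

/*
 * Returns true if the event was kept, false if it was dropped as a
 * duplicate of one of the last RECENT_N kept events.  A duplicate of an
 * older event is kept, which is the accepted trade-off of this scheme.
 */
static bool
add_notify(PendingNotifies *p, const char *channel, const char *payload)
{
    assert(p != NULL && channel != NULL && payload != NULL);

    if (recent_contains(p, channel, payload))
        return false;

    Notification *slot = &p->recent[p->next];

    snprintf(slot->channel, sizeof(slot->channel), "%s", channel);
    snprintf(slot->payload, sizeof(slot->payload), "%s", payload);

    p->next = (p->next + 1) % RECENT_N;
    if (p->count < RECENT_N)
        p->count++;
    p->queued++;    /* a real implementation would append to the full list here */
    return true;
}
```

A caller would zero-initialize a PendingNotifies at transaction start and
call add_notify once per NOTIFY.  A trigger that fires the same notify
for every updated row then collapses to one event, while a long run of
distinct payloads costs O(N) per insert instead of a scan of the whole
list.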

The other thing we'd need to find out is whether that's the only problem
for generating bazillions of notify events per transaction.  It won't
help to hack AsyncExistsPendingNotify if dropping the events into the
queue is still too expensive.  I am worried about the overall processing
cost here, consumers and producers both.

			regards, tom lane

