Thank you all so much for the feedback.
At this point, I'm convinced that the issue is more complicated than I originally thought :)
FWIW, my use case is for a company internal database. I open the database up to all users by simply having a "readonly" user that anyone can use to connect to the database and run queries. Some tables have sensitive data in them that I would prefer not to allow users to simply have access to via this account. However, that said, there are internal people that we want to empower to develop our internal tool and so the idea is that they can take a database dump using the readonly user to replicate the schema in their own instance.
I realize this usage may be quite specific to our company, but it seemed like the issue (as an outsider) was very simple.
Just want to reiterate my thanks for taking the time to look into this issue and consider it. This was my first interaction with the Postgres community and it was an overwhelmingly good one!
- Dan
On Wed, Feb 18, 2015 at 6:01 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Stephen Frost <sfrost@xxxxxxxxxxx> writes:
> * Tom Lane (tgl@xxxxxxxxxxxxx) wrote:
>> This is the standard mistake about pg_dump, which is to imagine that it
>> depends only on userspace operations while inspecting schema info. It
>> doesn't; it makes use of things like ruleutils.c which operate on "latest
>> available data" rules.
> There's two different points here- the first is the whole discussion
> around why pg_dump is depending on the backend for bits and pieces but
> not everything, but the second is- aren't the accesses from ruleutils.c
> now using an MVCC snapshot?
Yeah, they're using *an* MVCC snapshot. But it's not the transaction
snapshot, it's one that postdates all sinval traffic the backend has
received. Robert's changes to get rid of SnapshotNow didn't really
affect this issue at all. (To clarify: I'm worried about all the stuff
that involves syscache consultations; those queries executed via SPI
are not the issue.)
It now strikes me that it might be possible to use Andreas' logical
decoding infrastructure to allow pg_dump's backend to operate with a
"historical catalog snapshot", which perhaps could resolve this problem.
But there's no such logic there today, and I have no idea what the
disadvantages might be.
> Certainly there's a comment about that
> happening for pg_get_constraintdef_worker(), and other parts appear to
> go through SPI, but not everything does.
Yeah, Jan originally had a plan of making ruleutils operate exclusively
through SPI, but that lasted probably about a month and a half before
people started using syscache-accessing shortcuts. I think we really
would be best off to eliminate the SPI usage there altogether; it has
little effect except to waste cycles and mislead the credulous into
thinking ruleutils operates in userspace.
regards, tom lane