On Mon, Oct 17, 2011 at 10:28 AM, Alban Hertroys <haramrae@xxxxxxxxx> wrote:
> On 17 October 2011 16:24, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>> On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@xxxxxxxxx> wrote:
>>> On 17 October 2011 15:20, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>>>> A better way to do this is to query information_schema:
>>>>
>>>> PERFORM 1 FROM information_schema.tables where table_schema = x and
>>>>   table_name = y;
>>>>
>>>> IF NOT FOUND THEN
>>>>   CREATE TABLE ...
>>>> END IF;
>>>>
>>>> (there is a race condition in the above code -- do you see it?  If
>>>> concurrent access to this function is an issue, you have to LOCK an
>>>> object before running the PERFORM, or perhaps use an advisory lock).
>>>
>>> Is there? You'd think that with transactional DDL and the code running
>>> in a single transaction (namely inside a stored function) it would be
>>> concurrency-safe.
>>
>> Transactional DDL does not protect you from race conditions any more
>> than MVCC protects you from race conditions in regular DML.  What
>> transactional DDL does is roll back the changes in the event of an
>> error, so you don't have half-written schema changes in your database.
>> MVCC gives a rigorous definition of visibility rules, and transactions
>> guarantee only that a complete unit of work gets committed to the
>> database.  You still have to code defensively against multi-user
>> access.  The good news is that multi-user coding is about an order of
>> magnitude easier in SQL (especially the postgres variant) than in any
>> other development platform I'm aware of.
>>
>> The race condition here is basically the same problem that affects
>> 'upsert' patterns:
>>
>> test record(s)
>> if found, update
>> if not found, insert
>>
>> The problem is that in between the test and the insert someone else
>> can also test and get their insert in before you do.  You have two
>> general strategies to get around this: locking and retry.  I greatly
>> advise going the locking route unless your concurrency requirements
>> are very high.  It's much simpler, and since you're not invoking a
>> subtransaction, faster in the uncontested case.
>
> So what would happen if you don't lock? I think it's this:
>
> Session A       | Session B
> ----------------+-----------------
> SELECT x        | SELECT x
> NOT FOUND       | NOT FOUND
> ...             | CREATE TABLE
> CREATE TABLE    | <commit>
> <error>         |
> <rollback>      |
>
> If I understand correctly, if you don't mind the error and the
> subsequent rollback in Session A, then there's not much need to lock,
> or is there? It is important to be aware of the possible rollback of
> such a transaction, of course.

right -- allowing for rollback and retrying (either from the client or
in the procedure via a sub-transaction) is always an option.  I prefer
to lock -- it's faster (usually) and gives more regular behavior.

> And what would you lock? A record in information_schema.tables? That's
> a read-only view. A table that doesn't exist yet? Can't do. A record
> in the pg_ schema? Rather not...
>
> I suppose you could work around that problem by keeping track of your
> own tables that were generated using the aforementioned plpgsql
> function.  Then you have a table (that you own) with records to lock.

yeah -- you could use an advisory lock, a special table created for
that purpose, or a row of the table that does your tracking.  Agree
that locking system catalogs is *not* advisable.
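
For what it's worth, here is a minimal sketch of the advisory-lock
route.  The function name, the placeholder (id int) column list, and
the use of hashtext() to derive the lock keys are just illustrative
choices, not anything blessed; pg_advisory_xact_lock and format()
need 9.1 -- on older releases you'd pair pg_advisory_lock with an
explicit pg_advisory_unlock and build the DDL with quote_ident.

    -- hypothetical helper: create the table only if it isn't there yet,
    -- serializing concurrent callers on a transaction-scoped advisory lock
    CREATE OR REPLACE FUNCTION create_table_if_missing(_schema text, _table text)
    RETURNS void AS $$
    BEGIN
      -- take the lock first; it is released automatically at commit/rollback.
      -- hashtext() is just one way to map the names onto the two int4 keys.
      PERFORM pg_advisory_xact_lock(hashtext(_schema), hashtext(_table));

      -- with the lock held, check-then-create is no longer racy between
      -- sessions that go through this function
      PERFORM 1 FROM information_schema.tables
       WHERE table_schema = _schema AND table_name = _table;

      IF NOT FOUND THEN
        -- placeholder column list; substitute the real definition
        EXECUTE format('CREATE TABLE %I.%I (id int)', _schema, _table);
      END IF;
    END;
    $$ LANGUAGE plpgsql;

    -- usage:
    -- SELECT create_table_if_missing('myschema', 'mytable');

Note the lock only serializes sessions that call this function; a bare
CREATE TABLE issued elsewhere isn't covered, which is the usual
trade-off with advisory locks.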
merlin