On Mon, Jan 26, 2009 at 2:53 PM, Phoenix Kiula <phoenix.kiula@xxxxxxxxx> wrote: > On Mon, Jan 26, 2009 at 9:45 PM, Matthias Karlsson <matthias@xxxxxxx> wrote: >> On Mon, Jan 26, 2009 at 2:09 PM, Phoenix Kiula <phoenix.kiula@xxxxxxxxx> wrote: >>> I wonder if this is an SQL limitation or something I'm missing in the >>> PG manual, but I need to run an update on my database (to replace the >>> value of a column to match a new design structure). >>> >>> Due to the new business logic, the replaced value of a field may end >>> up being already present in the database in another record. This leads >>> to unique key violations when I run the update. >>> >>> My question: I don't mind if the update transaction skips the records >>> where the key would be violated (this preservation is in fact what we >>> want) but these are only about 2% of the overall updatable records. >>> >>> Is there anyway to make the transaction go through with the remaining >>> 98% of the update SQL which will in fact NOT violate the unique >>> constraint? >>> >>> -- >>> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) >>> To make changes to your subscription: >>> http://www.postgresql.org/mailpref/pgsql-general >>> >> >> You could always extend your update statement to include an additional >> check to see if there are already rows present with the same value in >> the field you are talking about. >> >> // Matthias >> > > > > Thanks Matthias, but this seems a little recursive to me and I don't > know how to do the SQL. > > Here is my SQL thus far. The table is "testimonials". Basically the > column "user_alias" needs to be replaced to delete any mention of a > user's "api_key". Both of these are fields in the same table, hence > the replace logic below. > > As you will also see, based on our business logic, I have already kept > all the related IDs in a separate small table called > "testimonials_temp". This should speed up the process quite a bit > because instead of going through 5 million IDs, we just loop through > around 400,000. > > > > update testimonials > set user_alias = replace(user_alias, '-'||api_key, '') > where > id in (select id from testimonials_temp) > ; > > > > The problem is that after being replaced like that the "user_alias" > column has a problem, because some user_aliases already exist. How > should I add a check condition recursively? I tried this: > > > update testimonials > set user_alias = replace(user_alias, '-'||api_key, '') > where > id in (select id from testimonials_temp) > and replace(user_alias, '-'||api_key, '') not in (select user_alias > from links where user_alias = ?????)) > ; > > > > Hope I have explained this clearly. Would appreciate any ideas! > My idea was very similar to the SQL at the end of your post. Wouldn't something like this work? update testimonials u set u.user_alias = replace(u.user_alias, '-'||api_key, '') where u.id in (select id from testimonials_temp) and not exists (select id testimonials where user_alias = replace(u.user_alias, '-'||api_key, ''))) ? Not sure if this exact SQL is correct, but in your not in expression, you just need to make sure to refer to the user_alias of the current row being updated. // Matthias -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general