Hi Rafal: On Mon, Jul 20, 2015 at 3:33 PM, Rafal Pietrak <rafal@xxxxxxxxx> wrote: > 3. there are methods (like cryptographic "random" sequence), which > guarantee no conflicts. So one should resort to that. > Regarding the last point. Usually, I implement one-time used vouchers as > rows in table like: > CREATE TABLE (voucher int not null, consumed bool, expire timestamp not > null default timestamp_pl_interval(now()::timestamp, '2 > min'::interval),..., unique (voucher,consumed) ); > with CONSUMED column NULLyfied when voucher is used. The entire row of > consumed voucher is purged after clearence and verification, which > happen significantly later. Random as a primary key is considered a bad practice by many people with much experience, nullyfing it too. Many people even frown on just changing the primary key ( and one of the reasons for using serial as keys in many situations is to have a guaranteed not null unchanging value ). > Such short lived (when active) voucher is usually just 6-digit long, to > help people enter it. Then, random and with a narrow value domain, they make, IMNSHO, a really bad choice for primery keys. > I don't know much about cryptography, but would a generic encryption > function (like that indicated by Daniel) have the same "waking through > the entire range-space" behavior as the original when that range-space > is externally (by my application) truncated to those 6 digits? If not, > would it be as efficient in conflict avoidance as used with original > 32-bit range-space? An encryption function never has collisions ( do not confuse with a hash ). If it had you would be unable to decrypt it. The problem is the value domain for you. i.e., for your example you could choose a bit stream cipher applied to a 20 bit value. This is a moderately complex prolem to find or build ( from the classic cryptographic primitives nearly every language includes ). This will map every different 20 bit input value to a different 20 bit output value, so your value domain will be 20 bit numbers, your inputs will be the 10^6 6 digit numbers and the outputs will be 10^6 DIFFERENT 20bit numbers, of wich you could expect about 4.8% of them ( 2^20-10^6)/10^6 to have 7 digits ( with a leading one in this case ). To solve that problem you could use 19 digit input/output numbers or try to fin a decimal cypher which uses exactly 10^6 input digits. If you use a 32 bit block cypher it will not have collisions, but if you TRUNCATE the 32 bit ~ 9.5 digits output to 6 digits, you are no longer encrypting. You may call it hashing or whatever, but that is NOTt encryption, you would have collisions. > Then again. Is it really a "good practice" to rely on a programmer to > peek "proper/correct encryption helper" instead of providing him/her > with a database-integrated tool for a "well defined" and not so rare > usage scenario as "random default" for UNIQUE/PK column? Many of us are too old to get caught by this. This question is like asking "Is it good practice to hit a person with a 10 pound hammer in the head instead of giving a cookie?". There are other options. IMO NOT modifying a very complex chunk of code ( the server code doing the inserts and checking the collision cases and acting on them, plus the parser for insert queries plus .... ) and risking all the bugs it may introduce to help with inserting random pk is good practice. It doesn't matter if the requesting programmer peeks a bad encryption methods, keeps his old code for inserting random ids or introduces bugs in his program, the potential harm to the rest of the users is too great. > So my conclusion from this thread is that as this usage scenario does > not seem to be foreseen by current implementation of ON CONFLICT > transaction, a workaround exists (like: cryptographic range-walker). > Being it a workaround, I'd vote for some direct supported of that > scenario in the future at database level. Bear in mind your problem is totally ill defined. I mean, you want to insert a random 6 digits ID, and want the database to keep trying until it finds an unique one. What should it do if there already are 10^6 records in the db? Stall until a free one is found? abort? This kind of uses is very narrow, and very difficult to get right , and normally confined to the application domain. Even if you choose a totally correct encryption function for collision avoidance, like identity, you are going to have problems in your scheme. You are not voting for anything, you need a feature proposal to vote upon. So far the only one I could extract from this thread is "something which magically solves the current Rafal problem". I would vote against that. Francisco Olarte. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general