Re: multiple UNIQUE indices for FK

Francisco Olarte <folarte@xxxxxxxxxxxxxx> · Fri, 4 Mar 2016 18:59:46 +0100

Hi Rafal:

These are my opinions, somebody else may think they are not correct,
comments are wellcome.

On Fri, Mar 4, 2016 at 2:30 PM, Rafal Pietrak <rafal@xxxxxxxxx> wrote:

>> This is normally a sympton of your schema not being appropiately normalized.
> OK. I'd apreciate some guidance here. I've sattled for a schema
> suggested in this thread a fiew posts before. In short went like this:
>
> CREATE TABLE persons(person_id primaty key, ...);
> CREATE TABLE msgs_person(msg_id, person_id references
> persons(person_id), rel_type, the_message_itself, primary
> key(message_id, person_id,rel_type),....);
>
> where:
> person_id - sender or recepient of the message
> msg_id - an ID uniquely assigned by sender
> rel_type - a role a row in msgs_person table is assigned to this
> particular relation: person+message; this role can be either SENDER or
> RECEPIENT

For this particular case ( person must be either a UNIQUE sender or a
multiple recipient ) my opinion will be:
0.- Table persons is OK.
1.- Move sender to msg.
CREATE TABLE messages (
    message_id primary key,
    sender_person_id NOT NULL  references persons,
    message_ts timestamp ( or other ordering imposing field, more on
this below).
    ....rest of fields.
)
2.- Make a recipients table:
CREATE TABLE recipients (
    message_id references messages,
    receipient_person_id references persons,
    primary key (message_id, person_id) -- to avoid duplicate senders.
)

Also, it will be useful to know if the sender can be a recipient to (
like with e-mail ).

Make sender_person_id NOT NULL in messages if you want to insure every
message ahs exactly ONE SENDER, leave it out if you want to allow
senderless messages. An FK column must either link to a record or be
null.

Then, if you want to have a msgs-person ''table'' I would use a view:

CREATE VIEW msgs_persons as
SELECT message_id, sender_person_id as person_id, 'SENDER' as role from messages
UNION ALL
SELECT message_id, recipient_person_id as person_id, 'RECIPIENT' as
role from recipients

> And now, apart from the above, I'm trying to put a NEXT field into the
> MSGS_PERSON table, so that I can (sort of):
> ALTER TABLE  msgs_person ADD CONSTRAINT next_fk FOREIGN KEY
> (next,person_id,rel_type) REFERENCES msgs_person(msg_id,person_id,
> rel_type); ...( just for: rel_type=SENDER).
> What should I do with this to have it "appropriately normalized"?

The normalization problem is more in the msgs-person table, as it
contains heterogeneous data, this is why I would use the recipients
table approach.

I do not know what you want to achieve with the NEXT. It seem you are
trying to build a linked list of sent messages. In my approach, if you
really need this, you just put the next field on messages, index it on
(sender_id+message_id) and make (sender_id+next_message_id) reference
that index.

But if you are just trying to know the last message for a sender, I
would put an ordering column, like timestamp ( my personal preference,
if message_id has a total ordering ( I suppose it's an integer or an
string ) would be to add message_ts, fill it with the message
sending/reception timestamp ( I do not know which kind of messages you
are storing, so not too sure aabout this, so lets say row insertion
timestamp ), use the message-id as a tie breaker ( in case you get
many messages with the same timestamp ) ( Or, if you have an external
ordered field, use that, more knowledge of the particular system is
needed ), index the table on message_ts ( or message_ts + message_id,
if lots of dupes ) to speed up queries and just do a query on messages
for a sender id, order by message_ts desc, message_id desc limit 1 for
the last message.

> "generic constraint's"? - pls elaborate regarding schema I've just
> schetched above.

You can check anything in a constraint on a table by using a function
to do things. Although, IIRC, there were some problems if you made
queries, and the thing was something better done with triggers. The
problem is difficult, and this is why when dessigning tables thinks
like linked list and similar are best avoided, as you enter a
difficult zone when you begin to consider visibility rules for the
data ( when many transactions, with different isolation levels, are
concurrently manipulating the same set of tuples). Your app may not do
this, but servers are dessigned to allow it.

> In that case, pls forgive my language - all I ment is that this
> "implementation detail", in postgresql is in fact "enforced policy" ...

Nothing to forgive, I was just trying to explain some things.

> but this isn't actually the problem here. I'm quite happy with a system
> that helps me avoid performence pitfalls.
> The problem I see is different - and the reason I'm asking about
> theoretical background of the implementation is different.
> The problem is that once one accepts the requirement for a unique index
> as FK target column "mandatory performance support", then I fail to see
> real reazon, where *ENY* unique index shouldn't do that role too. They
> are unique (within  domains of their conditions) and by definition yield
> a single row for FK (or nothing); that should be sufficient for the
> engine to keep data consistancy as expected, shouldn't it?

Maybe. I'm not following too much what you try to do here. Either you
are way above my level or you have a mental model of how postgres
should work which does not correspond to how it does.

> Naturally I undestand that there might be some deep reasons for
> exclusion of partial indexed as FK target "selectors" - I'd apreciate
> further explanations. But in case those reasons exist, I'd expect
> workarounds to exist too - like "SQL idioms" - that people normally use
> for cases like these. Would those "generic constraint" be be idiom?

Please, forget anything I said about generic constraint, my fault, I
was trying to express something, failed at it, and do not know how to
fix it.

What I personally do to avoid these kind of problems is to avoid
dessigning something which needs references to a partial index. And
I've been successful at it for a long time. I'm not going to recover
the complete thread to recap on why you exactly are doing these kind
of really advanced things, but I suppose you need them for some reason
and I do not have the resources to study it.

> So as I said before, I'm looking for some guidence here. (I'm really
> emotionally bond to that NEXT field there :)

ON this I cannot help you too much. I do not see what you are trying
to achieve with the NEXT field. These will need more explanations, and
more study, and as I said before, I do not have the available
resources for them.

Sorry for the tme I've taken, but I feel I can not be of any help here.

Best regards.

Francisco Olarte, over & out.

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general