Re: Questions on Tags table schema

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jay Kang wrote:
Hey Richard,

Sorry for the late reply, I was just making my first test version of the DB
closely resembling you suggested design. Just wanted to write you back
answering your questions. So here we go:

No problem - it's email and what with different timezones it's common to have gaps.

What is AddedBy - a name, a user-id?
If it's an ID, then it seems very long.
If it's a name, then 256 characters sounds a bit arbitrary as a length.
Why choose 256?

No, AddedBy is the username of the individual. Why is 256 characters
arbitrary as a length? Would 255 be better or 32? I guess, your saying its
too long, shorten it, I'm just going with what the books says, but I really
welcome any comments you have^^

The book doesn't know what you're trying to do. You do. The important thing is not whether you choose 256 or 32 or 100, it's that you've thought about it first.

Obvious thoughts:
1. Is this going to be a real name "Richard Huxton" or an identifier "rhuxton123"? You'll want more characters for the real name than an identifier. 2. Where will this be displayed and will it take up too much space? If I pick a username of WWW...250 repeats...WWW does that mess up any formatting? 3. Do we allow HTML-unsafe characters ('<', '&') and escape them when used, or just not allow them?

No wrong or right, the process of thinking about it is important.

The advantage of *not* having AddedBy as a foreign-key is that you can
delete users and not have to update tags with their user-id. The
disadvantage is the same thing. You can end up with tags added by
non-existent users.

Thanks, I would like anonymous users to be able to add tags, so I guess I'll
leave it the way it is^^

You would normally use NULL to indicate "unknown", which in the case of an anonymous user would be true. A NULL foreign-key is allowed (unless you define the column not-null of course).

[snip]
Well, it all depends on your use analysis. You could make a good
argument that there are two sets of fact data:
1. "Identity" data - id, name, added-by, added-ts, status
2. "Activity" data - number of views, last time clicked etc

If I were to create Identity and Activity for the Tags table, would I be
creating two separate tables called TagActivities and TagIdentities?

That's what I'm talking about, and you'd have a foreign-key constraint to make sure activity refers to a real tag identity. Again, I'm not saying you *do* want to do this, just that you'll need to think about it.

Currently, I'm not sure how I'll analysis the data for Tags. I know that I
want to do the bigger font if it is popular and smaller font if its not.
Hmm, would like to see examples of other websites that utilized Tags tables
to see how they implemented this function. I was thinking of adding tagcount
(popularity) for each user within the user definition table.

For this particular case, you'll almost certainly want to cache the results anyway. The popularity isn't going to change that fast, and presumably you'll only want to categorise them as VERY BIG, Big, normal etc. I assume asp.net allows you to cache this sort of information somehow.

You have cars which have tags and planes which have tags. Tagging a
plane is not the same as tagging a car. Either you confuse that issue,
or you want separate tables to track each relationship.
Hmm, so if I have a tag called "Saab" and a user clicks on Saab, then
information from both Cars and Planes table would appear.
Well, if you UNION them, yes. Of course you'll need to find columns that
make sense across all types. If planes have e.g. "wingspan" then you'll
need to add 'not applicable'::text in the car-related subquery.

Hmm, currently I can't visualize the query, again, it would help if I can
see the data to see what you mean. If planes table had a tag called
wingspan, wouldn't the query just not show any value for the field so it
wouldn't need 'not applicable' in the car-related subquery? Not sure really.

Sorry - I'm trying to say that if you UNION together several queries they all need to have the same columns. So - if one subquery doesn't have that column you'll need to provide a "not applicable" value instead.

Best of luck with the application, and don't forget to cache query results when they don't change often. It'll boost performance quite a bit.

P.S. - try the "general" mailing list if you want to discuss this sort of thing some more. This one is really supposed to be performance-related questions only.

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux