On Thu, 23 Apr 2009 12:00:30 +0200 Karsten Hilbert <Karsten.Hilbert@xxxxxxx> wrote: > On Thu, Apr 23, 2009 at 11:33:34AM +0200, Ivan Sergio Borgonovo > wrote: > > > Karsten Hilbert <Karsten.Hilbert@xxxxxxx> wrote: > > > > > On Thu, Apr 23, 2009 at 12:02:13AM +0100, Seref Arikan wrote: > > > > > > > I have a set of dynamically composed objects represented in > > > > Java, with string values for various attributes, which have > > > > variable length. In case you have suggestions for a better > > > > type for this case, it would be my pleasure to hear about > > > > them. > > > > > > Seref, he's suggesting you use TEXT instead of > > > VARCHAR(something). In PG it's actually usually *less* > > > overhead to use the unbounded text datatype (no length check > > > required). > > > > > > Length checks mandated by business logic can be added by > > > more dynamic means -- check constraints, triggers, etc which > > > allow for less invasive change if needed. > > > > Could you point us to some example of a constraint/trigger > > (etc...) that is going to provide the same checking of varchar > > and explain (if the code/example... doesn't make it clear) why > > it should be faster or less invasive? > > check constraint based: > > create table foo ( > txt text > check (char_length(txt) < 1001) > ); > > trigger based: > > create function trf_check_length_1000() ... returns > trigger ... $$...$$; > > create table foo ( > txt text > ); > > create trigger check_txt_length before INSERT or > UPDATE ... execute trf_check_length_1000(); > > faster: > > - TEXT is (judging by previous comments on this list) > marginally faster than VARCHAR(1000) because a) it runs > the same code but b) doesn't have to check for the 1000 > length > > - other options (VARCHAR, constraint, trigger) incur > additional overhead and are thus slower > > less invasive: > > Well, poor wording on my part, perhaps. What I meant is that > changing a check constraint or trigger appears to be a less > costly operation on a table than changing the datatype of a > column (although I seem to remember there being some > optimizations in place for the case of changing the *length* > of a varchar). I'll try to rephrase to check if I understood and for reference. varchar is slower than text since it has to do some "data type check". text is faster but if you add a check... it gets slower (slower than varchar?, faster?). constraint and trigger should have the advantage that in case of refactoring you're not going to touch the table definition that *may* end in being faster. But... if in postgresql implementation varchar is just text with a check... how can a change in type be faster? If it was a char(N) maybe there would be some kind of optimization since the length of the data is known in advance... so shrinking/enlarging a char(N) may have a different cost than shrinking a varchar(N) that in pg *should* have the same implementation than text. On the other end... you're preferring text just because they have the same implementation (modulo check) in Postgresql... but it may not be so in other DB. So *maybe* other DB do some optimization on varchar vs. text. Somehow I like the idea of considering a varchar a text with a check, but I think I prefer the "more traditional" approach since somehow is the "most expected". Nothing can handle strings of infinite length, and much before reaching infinite I'll get in trouble. People read differently what you'd like to say writing varchar(N). Most people read: 1) we expect a length around N Fewer people read: 2) There is something not working if we get something larger than N But it may also mean: 3) if we get something larger than N something is going to explode I think the same "ambiguity" is carried by check(). Anyway for a sufficiently large N 2) and 3) can be valid. Supposing the cost of loosing an insert for an unpredicted large value of N is high I'd be tempted to set N to at least protect me from 3) but I bet people may interpret it as 1). In my experience anyway varchar is a good early warning for troubles and the risk of being misunderstood/get caught by implementation dependent gotcha writing varchar(N) where N mean 3) largely encompass the risk of loosing an insert you didn't have to lose. Maybe I've spotted a potential advantage of check over varchar. If you use some kind of convention to name checks you could remove/re-apply them easier than spotting varchars(). The name of the constraint may contain metadata to help you. The name of the constraint may also suggest why it's there to your colleagues. But this works just if your implementation perform similarly on text over varchar(). -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general