Re: PostgreSQL CHARACTER VARYING vs CHARACTER VARYING (Length)

Rui DeSousa <rui@xxxxxxxxxxxxx> · Wed, 29 Apr 2020 01:52:07 -0400

> On Apr 29, 2020, at 1:30 AM, Tim Cross <theophilusx@xxxxxxxxx> wrote:
> 
> I think the key term in this thread is 'arbitrary'. When implementing a
> schema design, it should reflect the known constraints inherent in the
> model, but it should avoid imposing arbitrary constraints if none exist
> or cannot be determined.
> 
> So, if you know that a customer ID field has a current limitation of 50
> characters, then use a definition which reflects that. It may be that at
> some point in the future, this will be increased, but then again, it may
> not and that bit of information provides useful information for
> application developers and helps with consistency across APIs. Without
> some guideline, different developers will impose different values,
> leading to maintenance issues and bugs down the track.
> 
> On the other hand, imposing an arbitrary limitation, based on little
> more than a guess by the designer, can cause enormous problems. As an
> example, I was working on an identity management system where there was
> a constraint of 8 characters on the username and password. This was an
> arbitrary limit based on what was common practice, but was not a
> limitation imposed by any of the systems the IAM system interacted with.
> It was recognised that both fields were too small and needed to be
> increased. The easy solution would have been to make these fields text.
> However, that would cause a problem with some of the systems we needed
> to integrate with because either they had a limit on username size or
> they had a limit on password size. There were also multiple different
> APIs which needed to work with this system and when we performed
> analysis, they had varying limits on both fields.
> 
> What we did was look at all the systems we had to integrate with and
> found the maximum supported username and password lengths for each
> system and set the fields to have the maximum length supported by the
> systems with the shortest lengths. Having that information in the
> database schema also informed those developing other interfaces what the
> maximums were. It is quite likely these limits would be increased in the
> future and the database definition would need to be increased - in fact,
> some years after going into production, exactly this occurred with the
> password field when a different encryption algorithm was adopted which
> did not have the previous character limitation and the client wanted to
> encourage users to use pass phrases rather than a word. 
> 
> The point is, just using text for all character fields loses information
> and results in your model and schema being less expressive. Providing
> this information is sometimes critical in ensuring limits are maintained
> and provides useful documentation about the model that developers can
> use. However, imposing limits based on little more than a guess is
> usually a bad idea and if you cannot find any reason to impose a limit,
> then don't. I disagree with approaches which claim using text everywhere
> is easier and future proofing. In reality, it is just pushing the
> problem out for someone else to deal with. The best way to future proof
> your application is to have a clear well defined data model that fits
> the domain and is well documented and reflected in your database schema. 
> 
> 
> -- 
> Tim Cross
> 

I can’t agree more… Thanks Tim.