Rui DeSousa <rui@xxxxxxxxxxxxx> writes: >> On Apr 28, 2020, at 10:29 PM, raf <raf@xxxxxxx> wrote: >> >> Rui DeSousa wrote: >> >>>> On Apr 28, 2020, at 7:43 PM, raf <raf@xxxxxxx> wrote: >>>> >>>> I just use "text" for everything. It's less typing. :-) >>> >>> Ugh, I see it as sign that the designers of the schema didn’t fully >>> think about the actual requirements or care about them and it usually >>> shows. >> >> You are mistaken. I care a lot. That's why I >> future-proof designs whenever possible by >> not imposing arbitrarily chosen limits that >> appear to suit current requirements. >> >> In other words, I know I'm not smart enough >> to predict the future so I don't let that >> fact ruin my software. :-) >> >> cheers, >> raf >> > > Arbitrarily? What’s a cusip, vin, ssn? Why would you put a btree index on a text field? Because it’s not. > > What you’re advocating is a NoSQL design — defer your schema design. Letting the application code littered in multiple places elsewhere define what a cusip, etc. is. I think the key term in this thread is 'arbitrary'. When implementing a schema design, it should reflect the known constraints inherent in the model, but it should avoid imposing arbitrary constraints if none exist or cannot be determined. So, if you know that a customer ID field has a current limitation of 50 characters, then use a definition which reflects that. It may be that at some point in the future, this will be increased, but then again, it may not and that bit of information provides useful information for application developers and helps with consistency across APIs. Without some guideline, different developers will impose different values, leading to maintenance issues and bugs down the track. On the other hand, imposing an arbitrary limitation, based on little more than a guess by the designer, can cause enormous problems. As an example, I was working on an identity management system where there was a constraint of 8 characters on the username and password. This was an arbitrary limit based on what was common practice, but was not a limitation imposed by any of the systems the IAM system interacted with. It was recognised that both fields were too small and needed to be increased. The easy solution would have been to make these fields text. However, that would cause a problem with some of the systems we needed to integrate with because either they had a limit on username size or they had a limit on password size. There were also multiple different APIs which needed to work with this system and when we performed analysis, they had varying limits on both fields. What we did was look at all the systems we had to integrate with and found the maximum supported username and password lengths for each system and set the fields to have the maximum length supported by the systems with the shortest lengths. Having that information in the database schema also informed those developing other interfaces what the maximums were. It is quite likely these limits would be increased in the future and the database definition would need to be increased - in fact, some years after going into production, exactly this occurred with the password field when a different encryption algorithm was adopted which did not have the previous character limitation and the client wanted to encourage users to use pass phrases rather than a word. The point is, just using text for all character fields loses information and results in your model and schema being less expressive. Providing this information is sometimes critical in ensuring limits are maintained and provides useful documentation about the model that developers can use. However, imposing limits based on little more than a guess is usually a bad idea and if you cannot find any reason to impose a limit, then don't. I disagree with approaches which claim using text everywhere is easier and future proofing. In reality, it is just pushing the problem out for someone else to deal with. The best way to future proof your application is to have a clear well defined data model that fits the domain and is well documented and reflected in your database schema. -- Tim Cross