Re: Allow disabling folding of unquoted identifiers to lowercase

John McKown <john.archie.mckown@xxxxxxxxx> · Fri, 29 Apr 2016 16:13:33 -0500

On Fri, Apr 29, 2016 at 3:38 PM, John R Pierce <pierce@xxxxxxxxxxxx> wrote:

    On 4/29/2016 12:56 PM, John McKown wrote:

          I suspect this would be painful for the parser, unless you
          also enforced that all SQL keywords were in a specific case
          (all lower would be the minimal impact to the code).
          Otherwise the parser would have to lower() every token to
          check to see if it's a keyword, but if not, revert it to its
          original case.

          Why? PostgreSQL is written in C. So use strncasecmp() instead
          of strncmp() or strcasecmp() instead of strcmp() to test for a
          token.

    are those the APIs the parser uses?

I did a quick check of the .c files in the backend/parser directory of the source code. They all use the plain old strcmp() function. Too bad. I'm not a C expert, nor am I extremely familiar with the PostgreSQL source code. But doing some scans, I see the use of a function called "ScanKeywordLookup", and it does a case-insensitive search "the hard way". The comments indicate that this is due to the SQL standard's case-conversion requirements, and that the locale-aware tolower() cannot be trusted for the job:
/*
 * Apply an ASCII-only downcasing.  We must not use tolower() since it may
 * produce the wrong translation in some locales (eg, Turkish).
 */
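
For what it's worth, here is a minimal sketch of that idea as I understand it. This is not the PostgreSQL code: the keyword table, the length limit, and the names ascii_lower() and demo_keyword_lookup() are all made up for illustration. It downcases only 'A'-'Z' into a scratch buffer and then does an ordinary strcmp() binary search, so the caller's original token is never modified and there is nothing to "revert".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword table, for illustration only.
 * It must stay sorted, because the lookup below is a binary search. */
static const char *const demo_keywords[] = { "from", "select", "table", "where" };

/* Assumed upper bound on keyword length; longer tokens cannot be keywords. */
#define DEMO_MAX_KEYWORD_LEN 16

/* Fold only 'A'-'Z' to 'a'-'z'. Every other byte is left alone,
 * whatever the current locale says. */
static char ascii_lower(char ch)
{
    return (ch >= 'A' && ch <= 'Z') ? (char) (ch + ('a' - 'A')) : ch;
}

/* Return the index of text in demo_keywords, or -1 if it is not a keyword.
 * The caller's string is read but never changed. */
int demo_keyword_lookup(const char *text)
{
    char   word[DEMO_MAX_KEYWORD_LEN + 1];
    size_t len = strlen(text);
    size_t i;
    int    lo, hi;

    if (len > DEMO_MAX_KEYWORD_LEN)
        return -1;

    /* Downcase into a private copy. */
    for (i = 0; i < len; i++)
        word[i] = ascii_lower(text[i]);
    word[len] = '\0';

    /* Plain case-sensitive binary search over the lower-case table. */
    lo = 0;
    hi = (int) (sizeof(demo_keywords) / sizeof(demo_keywords[0])) - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        int cmp = strcmp(demo_keywords[mid], word);

        if (cmp == 0)
            return mid;
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}
```

With this, "SELECT", "select" and "SeLeCt" all find the same entry, while something like "MyTable" comes back as -1 with its original spelling untouched.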

Oh well, it's been interesting, but I don't think we'll come to a resolution for the OP on this issue. I blame both PostgreSQL and MySQL for this problem, because the SQL standard says that names are automatically folded to upper case unless they are enclosed in double quotes. They are not folded to lower case, as PostgreSQL does it, nor left unchanged, as MySQL does it.
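
To make the three behaviours concrete, here is a rough sketch. It is not taken from any of these products: demo_fold_identifier() and the DEMO_FOLD_* modes are names I invented, and it is deliberately simplified (ASCII letters only, no handling of doubled quotes inside a quoted name).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical folding modes matching the three behaviours described above. */
enum demo_fold_mode
{
    DEMO_FOLD_UPPER,   /* what the SQL standard asks for */
    DEMO_FOLD_LOWER,   /* what PostgreSQL does */
    DEMO_FOLD_NONE     /* name left unchanged */
};

/* Copy src into dst, folding case according to mode.
 * Returns 0 on success, or -1 if dst is too small. */
int demo_fold_identifier(const char *src, char *dst, size_t dstsize,
                         enum demo_fold_mode mode)
{
    size_t len = strlen(src);
    size_t i;

    /* A name wrapped in double quotes keeps its case; only the quotes go. */
    if (len >= 2 && src[0] == '"' && src[len - 1] == '"')
    {
        src++;
        len -= 2;
        mode = DEMO_FOLD_NONE;
    }

    if (len + 1 > dstsize)
        return -1;

    for (i = 0; i < len; i++)
    {
        char ch = src[i];

        if (mode == DEMO_FOLD_UPPER && ch >= 'a' && ch <= 'z')
            ch = (char) (ch - ('a' - 'A'));
        else if (mode == DEMO_FOLD_LOWER && ch >= 'A' && ch <= 'Z')
            ch = (char) (ch + ('a' - 'A'));
        dst[i] = ch;
    }
    dst[len] = '\0';
    return 0;
}
```

So an unquoted MyTable becomes MYTABLE under the standard's rule, mytable under PostgreSQL's rule, and stays MyTable if nothing is folded, while "MyTable" in double quotes stays MyTable in every mode.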

-- 
The unfacts, did we have them, are too imprecisely few to warrant our certitude.

Maranatha! <><
John McKown