Martijn van Oosterhout wrote:
On Sat, Oct 15, 2005 at 05:46:03PM -0700, Chris Travers wrote:
This is largely what I have been suggesting. However, you still have
two (small) problems that could be solved fairly easily I think. Take
Tom's example:
SELECT MAX(relpages) FROM pg_class;
Now, I am perfectly happy to require relpages and pg_class to be quoted
(if you don't want to, there are always views....), however, MAX is also
an identifier, which if I understand Tom correctly is also folded to
lower case. If you fold MAX to upper case, you may not find the
aggregate. I don't see an easy answer aside from having separate
entries for "max" and "MAX" which are functionally the same.
I think you're certainly waving away a lot of potential issues with saying
that "as long as system identifiers are quoted we're fine". Tom's
example is just the tip of the iceberg.
- consider functions like nextval/currval/substring. Some of these are
created and used internally too, should they always be quoted? Would
these functions all appear twice in the functions list?
I think I mentioned that the case issue is of particular importance for
functions and aggregates because we don't want to type:
SELECT "max"("relpages") FROM "pg_class" and this is definitely not
standard anyway.
I think any functions that are shipped with PostgreSQL should come in
both an upper case and lower case form. I am willing to commit to
building these in the system catalogs. Same with basic types (like
"INT" and "int") for example. Or another option would just be to lower
case all reserved SQL words though this would lead to a wide range of
possible issues.
- consider oid,xmin,xmax,etc are they system identifiers? You can't
provide two copies of those, since each column can only appear once in
the catalogs.
No you can't but if you need to refer to oid, xmin, and xmax from
outside the backend, you can always quote them. Besides if you really
nead that level of compatibility, consider the following....
CREATE VIEW "PG_CLASS" AS
SELECT "oid" AS "OID", "relname" as "RELNAME", .....
FROM "pg_class";
and then create associated rules. So yes, we can have two entries in
the system catalogs if we need to because views are wonderful things
:-) The only real exception is with functions and agregates, and these
are bigger issues.
BTW, not to split hairs here too much, but for those customers that need
this now, I suppose one could always use this same trick.
CREATE SCHEMA "PUBLIC";
CREATE VIEW "PUBLIC"."TABLE1" AS
SELECT "field1" AS "FIELD1"....
FROM "public"."table1";
CREATE RULE irule AS ON INSERT INTO "TABLE1"
DO INSTEAD INSERT INTO "public"."table1" ("field1"...) VALUES
(NEW."FIELD1"...);
etc....
Could be a lot of work, but it should be possible to write a Perl script
to generate these for you :-)
Sounds like a project for me :-P
- consider pg_dump, psql and other such tools with intimate knowledge
of the catalogs, they would all need to updated.
If you need to create backward compatibility via views, my proposal
would allow for that. Also if this is a GUC variable, there is no
reason it can't be set by the application where needed. Also, as far as
I am concerned, I would rather leave the system catalogs alone and
change them as *little* as possible for these reasons.
- consider the number of third party apps that uses these "system
identifiers". Updating all of those is a massive task, for what gain?
See above regarding views. The only difference between shipping it with
this mode enabled is that it provides an easy means for customers to
avoid the problem and we have a centralized solution.
There is some sympathy around here for the view that in this case the
standard is simply wrong. As said on this thread already, uppercase
only was appropriate for twenty years ago but not now. Older
programming languages tended to have their keywords in uppercase, yet
no modern language would consider such an act.
So, I think in the long run there is going to be some sympathy for
helping people with "QUOTEDUPPERCASE" identifiers to port their stuff
but I don't think PostgreSQL itself will ever default to folding up.
Who ever said anything about defaulting. I simply said an option should
be present. FWIW I prefer the way PostgreSQL does it, but I have to
recognize that it will bite people migrating from RDBMS's that do follow
the standards.
Hence the discussion about folding quoted identifiers also, which
solves the issue without masses of changes everywhere.
Curiously, MySQL does as we do, although they don't accept standard
quoting at all (using backquotes). Do people complain about them too?
MySQL's unquoted table names being case sensitive is even more broken
than ours ;-) Besides if people using MySQL cared about standards, I
don't think they would still be using MySQL...
It seems to me that the first step in any solution is going to be
creating a tool that automatically generates views in appropriate schema
using upper case (or maybe optionally lower case) table and column
names. This should be fairly simple to do. Then such a tool could be
used to provide whatever sort of compatibility we need. Indeed this may
in fact be more flexible in helping people migrate to
BrokenSystemsLikeMySQL because it would allow you to create views off
tables using lower case names as well.
A second step will be to ensure that the backend really does treat
identifiers as already double-quoted.
However, this is only a partial fix. It doesn't solve the larger issue
of (albeit badly written but standards-compliant) queries like:
SELECT "MYCOLUMN" FROM MYTABLE;
My fundamental point is that the above statement means something
specific according to the SQL standard, and it means something
*different* in PostgreSQL. The time may not be ripe for a solution but
if it is at least on the TODO list, then maybe people will be less bit
by it if and when we actually do support this behavior.
Best Wishes,
Chris Travers
Metatron Technology Consulting
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match