On 5/22/21 5:52 PM, Bryn Llewellyn wrote:
Tom, David, Adrian, and Peter—thank you all very much for taking an
interest in my questions. Your responses have, collectively, been an
enormous help. I deleted the text of the exchanges in this particular
branch of the overall discussion because it's become rather difficult to
work out who said what in response to what. It's all there in the
"pgsql-general" email archive.
Here's my summary (in no particular order) of what I've learned. Please
feel free to ignore it.
----------------------------------------------------------------------
(1) In the context of discussing timezones, the English word
"abbreviation" maps to two distinct notions in PG:
"pg_timezone_names.abbrev"; and "pg_timezone_abbrevs.abbrev". Tom
earlier said « there are two views [because there are] two sources of
truth involved ». This really means that these two columns list
different classes of facts. It's therefore unfortunate that they both
have the same name.
They are reporting the same facts, just in different context.
pg_timezone_abbrevs.abbrev is a list of un-duplicated, as I found out,
abbreviations. pg_timezone_names.abbrev shows the abbreviations or
offsets in effect at CURRENT_TIMESTAMP.
----------------------------------------------------------------------
(2) It's no wonder, therefore, that I was confused. Anybody with even a
slight exposure to relational design would guess
that "pg_timezone_names.abbrev" is a FK to the PK in
"pg_timezone_abbrevs.abbrev". And they might wonder why "utc_offset" and
"is_dst" seem to be denormalized. But they'd probably recall that such
things are common, as a usability convenience, in views. Anyway, I'm
beyond that confusion now.
Since the views are based on functions that are displaying different
context I don't find it surprising.
----------------------------------------------------------------------
(3) It helps me to think of "pg_timezone_names.abbrev" as
"nickname"—which notion is unique, just for each name. It's useful only
informally, and only when a particular timezone observes DST, as a
shorthand for disambiguation. I used the example earlier:
The fact that Asia/Manila happens not to observe DST makes my example a
little less powerful. Never mind, their politicians could decide to
introduce it presently and to give DST the nickname BST (or DOG). It's
remarkable, in itself, that "ST" in "BST" means "Summer Time" but that
it means "Standard Time" in "PST". But this nicely unerlines the point
that there's no rhyme or reason in the design of these abbreviations.
Notably, the "nickname" in "pg_timezone_names" has nothing at all
(formally) to do with "abbrev" in "pg_timezone_abbrevs".
Yes it does. It just there is not always a one-to-one correspondence
between the two. For instance duplicate abbreviations do exist but they
are filtered out of pg_timezone_abbrevs.
----------------------------------------------------------------------
(4) My realization, as set out in #3, helps me now to understand the rule:
« PostgreSQL allows you to specify time zones in three different
forms... A time zone abbreviation, for example PST [but ONLY] in
date/time input values and with the AT TIME ZONE operator… Such a
specification merely defines a particular offset from UTC... The
recognized abbreviations are listed in the pg_timezone_abbrevs view »
This (to my embarrassment, now) does say that a "nickname" from
"pg_timezone_names" is not allowed as the argument for the AT TIME ZONE
operator. But it doesn't adumbrate that the English word "abbreviation"
means what I now understand it to mean. You need to get the point from
one terse sentence without the luxury of examples to strengthen the
explanation.
silently succeeds. Sure enough, the doc does say « PostgreSQL will
accept POSIX-style time zone specifications, as described in Section
B.5. [as the argument for AT TIME ZONE] » But what a quagmire of
confusion this is. I said elsewhere that the same aim (get the local
time at -42 hours w.r.t. UTC) can be met with transparent syntax, thus:
*select ('2021-05-22 15:00:00 America/Los_Angeles'::timestamptz at time
zone 'UTC') - '42:00'::interval;*
I know which syntax I prefer!
That is the root of the above, preference. There are many ways to
express time zones and as a general purpose database Postgres needs to
deal with all of them.
----------------------------------------------------------------------
(5) I was embarrassingly slow to find this in the doc for the SET command:
« Timezone settings given as numbers or intervals are internally
translated to POSIX timezone syntax. For example, after SET TIME ZONE
-7, SHOW TIME ZONE would report <-07>+07. »
(David pointed this out.) This is confusing in itself. And it's
compounded by this fact: POSIX uses positive offsets to denote what both
the utc_offset column and to_char(<some timestamptz value>), with the
format element TH:TM, shows as negative. But it is what it is.
See my earlier post about trying to find logic in this.
----------------------------------------------------------------------
(6) To the extent that a catalog view can have a business unique key
(and I believe that this must be the case for it to be useful), the
unique key for "pg_timezone_names" is "name" and the unique key for
"pg_timezone_abbrevs" is "abbrev". I'm going to assume that PG has code
to maintain these invariants. This gives me the way to interpret the
statement "an [abbreviation] specification merely defines a particular
offset from UTC".
«
The @OVERRIDE syntax indicates that subsequent entries in the file can
override previous entries (typically, entries obtained from included
files). Without this, conflicting definitions of the same timezone
abbreviation are considered an error.
»
So even without "timezone_abbreviations" set to "default" at session
level, "pg_timezone_abbrevs.abbrev" will always be unique.
True.
----------------------------------------------------------------------
(7) There are two different syntaxes for setting the timezone session
environment variable. This (or with TO instead of =):
*set timezone =*
and this:
*set time zone*
This, too, is hugely confusing. (Correspondingly, there's both "show
timezone" and "show time zone".)
The first "set" (and "show") alternative is consistent with how you set
all other session environment variables. Moreover, all are observable
with the current_setting() built-in function. And there the argument
"timezone" can only be spelled as one word. This makes me favor the "set
timezone" spelling.
However, only the "set time zone" spelling allows an argument that's an
explicit interval value like this:
*set time zone interval '-08:00';*
I said "explicit interval value" because this isn't true. For example,
these two fail:
*set time zone '-08:00'::interval;*
and
*set time zone make_interval(hours=>-8);*
both fail, even though each uses a legal interval value. This is
confusion on top of confusion. All the more reason to avoid it.
It is spelled out here:
https://www.postgresql.org/docs/12/sql-set.html
Basically set time zone is for customizing the timezone value, whereas
set timezone is for using 'standard' timezone values. They both end up
setting the same value.
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx