Search Postgresql Archives

Re: The contents of the pg_timezone_names view bring some surprises

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/22/21 5:52 PM, Bryn Llewellyn wrote:
Tom, David, Adrian, and Peter—thank you all very much for taking an interest in my questions. Your responses have, collectively, been an enormous help. I deleted the text of the exchanges in this particular branch of the overall discussion because it's become rather difficult to work out who said what in response to what. It's all there in the "pgsql-general" email archive.

Here's my summary (in no particular order) of what I've learned. Please feel free to ignore it.

----------------------------------------------------------------------

(1) In the context of discussing timezones, the English word "abbreviation" maps to two distinct notions in PG: "pg_timezone_names.abbrev"; and "pg_timezone_abbrevs.abbrev". Tom earlier said « there are two views [because there are] two sources of truth involved ». This really means that these two columns list different classes of facts. It's therefore unfortunate that they both have the same name.

They are reporting the same facts, just in different context. pg_timezone_abbrevs.abbrev is a list of un-duplicated, as I found out, abbreviations. pg_timezone_names.abbrev shows the abbreviations or offsets in effect at CURRENT_TIMESTAMP.


----------------------------------------------------------------------

(2) It's no wonder, therefore, that I was confused. Anybody with even a slight exposure to relational design would guess that "pg_timezone_names.abbrev" is a FK to the PK in "pg_timezone_abbrevs.abbrev". And they might wonder why "utc_offset" and "is_dst" seem to be denormalized. But they'd probably recall that such things are common, as a usability convenience, in views. Anyway, I'm beyond that confusion now.

Since the views are based on functions that are displaying different context I don't find it surprising.


----------------------------------------------------------------------

(3) It helps me to think of "pg_timezone_names.abbrev" as "nickname"—which notion is unique, just for each name. It's useful only informally, and only when a particular timezone observes DST, as a shorthand for disambiguation. I used the example earlier:



The fact that Asia/Manila happens not to observe DST makes my example a little less powerful. Never mind, their politicians could decide to introduce it presently and to give DST the nickname BST (or DOG). It's remarkable, in itself, that "ST" in "BST" means "Summer Time" but that it means "Standard Time" in "PST". But this nicely unerlines the point that there's no rhyme or reason in the design of these abbreviations.

Notably, the "nickname" in "pg_timezone_names" has nothing at all (formally) to do with "abbrev" in "pg_timezone_abbrevs".

Yes it does. It just there is not always a one-to-one correspondence between the two. For instance duplicate abbreviations do exist but they are filtered out of pg_timezone_abbrevs.


----------------------------------------------------------------------

(4) My realization, as set out in #3, helps me now to understand the rule:

« PostgreSQL allows you to specify time zones in three different forms... A time zone abbreviation, for example PST [but ONLY] in date/time input values and with the AT TIME ZONE operator… Such a specification merely defines a particular offset from UTC... The recognized abbreviations are listed in the pg_timezone_abbrevs view »

This (to my embarrassment, now) does say that a "nickname" from "pg_timezone_names" is not allowed as the argument for the AT TIME ZONE operator. But it doesn't adumbrate that the English word "abbreviation" means what I now understand it to mean. You need to get the point from one terse sentence without the luxury of examples to strengthen the explanation.


silently succeeds. Sure enough, the doc does say « PostgreSQL will accept POSIX-style time zone specifications, as described in Section B.5. [as the argument for AT TIME ZONE] » But what a quagmire of confusion this is. I said elsewhere that the same aim (get the local time at -42 hours w.r.t. UTC) can be met with transparent syntax, thus:

*select ('2021-05-22 15:00:00 America/Los_Angeles'::timestamptz at time zone 'UTC') - '42:00'::interval;*

I know which syntax I prefer!

That is the root of the above, preference. There are many ways to express time zones and as a general purpose database Postgres needs to deal with all of them.


----------------------------------------------------------------------

(5) I was embarrassingly slow to find this in the doc for the SET command:

« Timezone settings given as numbers or intervals are internally translated to POSIX timezone syntax. For example, after SET TIME ZONE -7, SHOW TIME ZONE would report <-07>+07. »


(David pointed this out.) This is confusing in itself. And it's compounded by this fact: POSIX uses positive offsets to denote what both the utc_offset column and to_char(<some timestamptz value>), with the format element TH:TM, shows as negative. But it is what it is.

See my earlier post about trying to find logic in this.


----------------------------------------------------------------------

(6) To the extent that a catalog view can have a business unique key (and I believe that this must be the case for it to be useful), the unique key for "pg_timezone_names" is "name" and the unique key for "pg_timezone_abbrevs" is "abbrev". I'm going to assume that PG has code to maintain these invariants. This gives me the way to interpret the statement "an [abbreviation] specification merely defines a particular offset from UTC".


«
The @OVERRIDE syntax indicates that subsequent entries in the file can override previous entries (typically, entries obtained from included files). Without this, conflicting definitions of the same timezone abbreviation are considered an error.
»

So even without "timezone_abbreviations" set to "default" at session level, "pg_timezone_abbrevs.abbrev" will always be unique.

True.


----------------------------------------------------------------------

(7) There are two different syntaxes for setting the timezone session environment variable. This (or with TO instead of =):

*set timezone =*

and this:

*set time zone*

This, too, is hugely confusing. (Correspondingly, there's both "show timezone" and "show time zone".)

The first "set" (and "show") alternative is consistent with how you set all other session environment variables. Moreover, all are observable with the current_setting() built-in function. And there the argument "timezone" can only be spelled as one word. This makes me favor the "set timezone" spelling.

However, only the "set time zone" spelling allows an argument that's an explicit interval value like this:

*set time zone interval '-08:00';*

I said "explicit interval value" because this isn't true. For example, these two fail:

*set time zone '-08:00'::interval;*


and

*set time zone make_interval(hours=>-8);*

both fail, even though each uses a legal interval value. This is confusion on top of confusion. All the more reason to avoid it.

It is spelled out here:

https://www.postgresql.org/docs/12/sql-set.html

Basically set time zone is for customizing the timezone value, whereas set timezone is for using 'standard' timezone values. They both end up setting the same value.








--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux