Thanks for the review. I've attached a proposed patch that attempts to
address the issues you raised. Unless otherwise noted below, I agreed
with your comments and incorporated them into the patch. The attached
patch assumes the patches I proposed earlier in
<https://www.ietf.org/mail-archive/web/tzdist-bis/current/msg00339.html>
and in
<https://www.ietf.org/mail-archive/web/tzdist-bis/current/msg00345.html>.
Certain
passages suggest that processors are expected to *not* examine the
version number in a file,
Readers can examine the version number (which some do), or they can
charge ahead and assume the version is good enough (some do that too).
If a passage suggests otherwise let's fix it (I briefly looked for such
passages and didn't find any).
silent acceptance of trailing
garbage is not specified, and this strategy is different from most
IETF standards.
It's the strategy used in practice for this format. For example, the
reference implementation (tzcode) does it that way. The attached
proposed patch adds text to section 3 paragraph 2 to try to
document this better.
the glossary (in section 2)
gives definitions of several terms that suggest the document is
attempting to define the semantics, but the definitions given are
nowhere near sufficient to specify those semantics.
The glossary doesn't explain every aspect of timekeeping from the ground
up and isn't really the place to do so, as it would take a lot of space
and our readers are likely to be reasonably familiar with timekeeping
already. That being said, we can fix specific issues with definitions as
they come up (including the issues you noted). Also, the attached
proposed patch prefaces the glossary with a pointer to tz-link to try to
help readers unfamiliar with timekeeping.
3. Apparently (see comments on section 4), characters outside the
ASCII set are allowed in time zone designations. If so, their
encoding needs to be specified.
The spec deliberately does not have a MUST for encoding. All the format
requires is a string of octets terminated by an 0x00 octet. The attached
proposed patch adds some text to say this explicitly. (Section 4 does
say the encoding SHOULD be ASCII, and the patch doesn't change this.)
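To illustrate the "string of octets terminated by an 0x00 octet" rule, here is a minimal Python sketch. The function name designation_at and the sample octets are made up for illustration; they come from neither the draft nor tzcode. The function returns raw octets and deliberately does not decode them, since the format does not mandate an encoding.

```python
def designation_at(designations: bytes, idx: int) -> bytes:
    """Return the octets of the designation that starts at idx, without the NUL.

    'designations' stands for the time zone designation octets of one data
    block (charcnt octets long); 'idx' is the (desig)idx of a time type.
    """
    if not 0 <= idx < len(designations):
        raise ValueError("idx is outside the range [0, charcnt-1]")
    end = designations.find(b"\x00", idx)
    if end < 0:
        raise ValueError("no NUL octet at or after idx")
    return designations[idx:end]


# Hypothetical designation octets: three NUL-terminated strings.
sample = b"LMT\x00PDT\x00PST\x00"
print(designation_at(sample, 4))  # b'PDT'
print(designation_at(sample, 5))  # b'DT', a suffix that overlaps 'PDT'
print(designation_at(sample, 3))  # b'', an empty designation
```

A caller that wants text can decode the result as ASCII and decide for itself what to do when that fails.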
The sequence "nor does it define the source the time zone data" is
grammatically incorrect.
Noted earlier with a proposed fix in the
<https://www.ietf.org/mail-archive/web/tzdist-bis/current/msg00339.html>
email cited above.
Also, it's not clear how "as defined in Section
3 of [RFC7808]" attaches to the sentence, since the closest item in
the list, "versions", is defined in section 3 and not RFC 7808.
As far as I'm concerned the sentence is mostly a red herring, and we can
reduce confusion by dropping the laundry list and the mostly-irrelevant
citation of RFC 7808, as done in the attached proposed patch.
Daylight Saving Time (DST): The time according to a location's law
or practice, adjusted as necessary from standard time. The
adjustment may be positive, negative, or zero.
This seems to read that "Daylight Saving Time" is not "standard time"
plus the summer time offset, but standard time, adjusted by whatever
summer time offset might be in effect at the moment. But that is not
the definition of DST, at least, not as commonly used in the US.
In general there is no single "summer time offset"; some localities have
used different offsets at different points in the summer. The attached
proposed patch attempts to make this clearer.
2. a change in whether standard or daylight saving time is in use
Is this intended to include the regular transition between summer time
and winter time, or only when a locality starts or stops the practice
of using summer time each year or the schedule of transitions?
The former. The attached proposed patch rewords this.
Wall Time: The time as shown on a clock set according to a
location's law or practice.
In what way does this differ from "Local Time"?
It doesn't. Good catch. Fixed in the attached proposed patch, which also
tries to nail down "local time" better.
'2' (0x32) Version 2 - The file MUST contain the 32-bit header
and data block, a 64-bit header and data block, and a footer.
The phrases "32-bit header" and "64-bit header" don't work, as the
headers have the same format. They could be called "header for 32-bit
data", etc., but that's a bit awkward. Alternatively, names for all
of the data sections could be defined in section 2.
The attached proposed patch attempts to address this by using the names
"version 1 header" and "version 2+ header", and similarly for data blocks.
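To show how the two headers relate, here is a rough Python sketch of a reader that parses the version 1 header and uses its counts to skip the version 1 data block. The function names are my own and not from tzcode. The field order follows the header figure in the patch. The magic value "TZif" and the one-octet size of each transition type and each indicator are taken from the rest of the draft rather than from the hunks below, so treat them as assumptions here.

```python
import struct

# magic (4), version (1), unused (15), then six four-octet unsigned counts,
# all in network (big-endian) octet order: 44 octets in total.
HEADER = struct.Struct(">4sc15s6I")


def parse_header(buf: bytes, offset: int = 0) -> dict:
    """Parse one 44-octet TZif header starting at 'offset'."""
    (magic, version, _unused, isutcnt, isstdcnt,
     leapcnt, timecnt, typecnt, charcnt) = HEADER.unpack_from(buf, offset)
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    return {"version": version, "isutcnt": isutcnt, "isstdcnt": isstdcnt,
            "leapcnt": leapcnt, "timecnt": timecnt, "typecnt": typecnt,
            "charcnt": charcnt}


def data_block_size(h: dict, time_size: int) -> int:
    """Length in octets of the data block described by header 'h'."""
    return (h["timecnt"] * time_size          # transition times
            + h["timecnt"]                    # transition types, 1 octet each
            + h["typecnt"] * 6                # utoff (4) + dst (1) + idx (1)
            + h["charcnt"]                    # time zone designation octets
            + h["leapcnt"] * (time_size + 4)  # leap second records
            + h["isstdcnt"]                   # standard/wall indicators
            + h["isutcnt"])                   # UT/local indicators


def v2_header_offset(buf: bytes):
    """Offset of the version 2+ header, or None for a version 1 file."""
    h1 = parse_header(buf)
    if h1["version"] == b"\x00":
        return None
    return HEADER.size + data_block_size(h1, 4)


# Synthetic example: a version 2 file whose version 1 data block holds one
# local time type record and a four-octet designation, and nothing else.
v1_header = HEADER.pack(b"TZif", b"2", bytes(15), 0, 0, 0, 0, 1, 4)
v1_block = struct.pack(">iBB", 0, 0, 0) + b"UTC\x00"
print(v2_header_offset(v1_header + v1_block))  # 54
```

A reader that understands only version 1 would stop at that same offset and ignore whatever follows.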
typecnt: A four-octet unsigned integer specifying the number of
local time type records contained in the data block - MUST NOT be
zero. (Although time type 0 is not used in files that have
nonempty TZ strings but no transitions, it is nevertheless
required because many TZif readers reject files that lack time
types.)
This is hard to understand.
Reworded in the attached proposed patch.
* "application/tzif-leap" (Section 8.2) to indicate that leap
second records are included in the TZif data.
You probably need to spell out: "For version 1 files, leap second
records must be present in the 32-bit data block; for version 2 and 3
files, leap second records must be present in the 64-bit data block."
Leap second records need not be present even in application/tzif-leap
files, so the attached draft changes this to "leap second records are
included in the TZif data as necessary (none are necessary if the file
is truncated to a range that precedes the first leap second)".
10.3. URIs
This section is odd. Item [1] is just the URL for BCP 14, but BCP 14
is also listed in the references section. Items [6] and [7] duplicate
[8] and [9], but they're references and should be treated as such.
Items [3], [4], and [5] are more interesting -- they're locations
inside reference [POSIX]. The Editor should be able to suggest a
better way of presenting these references.
The cross references do not work the way I would like
<https://www.ietf.org/mail-archive/web/tzdist-bis/current/msg00326.html>
and apparently can't be done well
<https://www.ietf.org/mail-archive/web/tzdist-bis/current/msg00328.html>. I
gave up on doing them well, but if someone more expert in
XML/HTML/whatever can fix things I'll be a happy camper.
Appendix A. Common Interoperability Issues
Most of these are problems in generating TZif files
for use by readers conforming to predecessors of this specification.
It would be helpful to those dealing with compatibility issues to have
references to the predecessors of this specification.
The predecessors are published in the version-controlled tzfile.5 man
pages in the tzdb development version
<https://github.com/eggert/tz/commits/master/tzfile.5>. The attached
patch hyperlinks to that.
--- draft-murchison-tzdist-tzif-14-02.xml 2018-09-27 08:11:16.386765915 -0700
+++ draft-murchison-tzdist-tzif-14-03.xml 2018-10-05 13:50:34.752957461 -0700
@@ -117,10 +117,8 @@
</t>
- <t>This specification does not define the source of leap second
- information, nor does it define the source of the time zone
- data, metadata, identifiers, aliases, localized names, or
- versions as defined in Section 3 of <xref target='RFC7808'/>.
+ <t>This specification does not define the source of the data
+ assembled into a TZif file.
One such source is the IANA-hosted time zone database <xref
target="RFC6557" />.</t>
</section> <!-- Intro -->
@@ -133,7 +131,11 @@
<xref target='RFC2119' /> <xref target='RFC8174' />
when, and only when, they appear in all capitals, as shown here.</t>
- <t>The following terms are used in this document:
+ <t>The following terms are used in this document.
+ See <xref target="tz-link">"Sources for Time Zone and Daylight
+ Saving Time Data"</xref> for more-detailed information about
+ civil timekeeping data and practice.
+
<list style="hanging">
<t hangText="Coordinated Universal Time (UTC):">
The basis for civil time since 1960.
@@ -142,8 +144,11 @@
</t>
<t hangText="Daylight Saving Time (DST):">
The time according to a location's law or practice,
- adjusted as necessary from standard time. The adjustment
- may be positive, negative, or zero.
+ when adjusted as necessary from standard time. The
+ adjustment may be positive or negative, and the amount of
+ adjustment may vary depending on the date and time; the TZif
+ format even allows the adjustment to be zero, although this
+ is not common practice.
</t>
<t hangText="International Atomic Time (TAI):">
The time standard based on atomic clocks since 1972.
@@ -159,8 +164,8 @@
second.
</t>
<t hangText="Local Time:">
- The time according to a location's current time zone
- offset from Universal Time.
+ Civil time for a particular location. Its offset
+ from Universal Time can depend on the date and time of day.
</t>
<t hangText="POSIX Epoch:">
1970-01-01 00:00:00 UTC, the basis for absolute timestamps
@@ -175,8 +180,8 @@
or more of the following happen simultaneously:
<list style="numbers">
<t>a change in UT offset</t>
- <t>a change in whether standard or daylight saving time is
- in use</t>
+ <t>a change in whether daylight saving time is
+ in effect</t>
<t>a change in time zone abbreviation</t>
<t>a leap second (i.e., a change in LEAPCORR)</t>
</list>
@@ -221,15 +226,14 @@
the timestamp 1972-07-01 00:00:00 UTC would be 78796801, one
greater than the UNIX time for the same timestamp.
Similarly, if the second leap second record occurs at
- 1972-12-31 23:59:60 UTC, its UNIX leap time would be
- 94694401; the second occurrence accounts for the first leap
- second.
+ 1972-12-31 23:59:60 UTC, it accounts for the first leap second,
+ so the UNIX leap time of 1972-12-31 23:59:60 UTC would be 94694401
+ and the UNIX leap time of 1973-01-01 00:00:00 UTC would be 94694402.
If a TZif file specifies no leap second records,
- UNIX leap time is equivalent to UNIX time.
+ UNIX leap time is equal to UNIX time.
</t>
<t hangText="Wall Time:">
- The time as shown on a clock set according to a location's
- law or practice.
+ Another name for local time; short for "wall clock time".
</t>
</list>
</t>
@@ -238,77 +242,80 @@
<section anchor='format'
title='The Time Zone Information Format (TZif)'>
<t>The time zone information format begins with a fixed 44-octet
- <xref target='header'>header</xref> followed by a
- variable-length <xref target='data'>data block</xref> using
+ <xref target='header'>version 1 header</xref>
+ containing a field that specifies the version
+ of the file's format. Readers designed for version N can read
+ version N+1 files without too much trouble; data specific to
+ version N+1 typically either appears after version N data so
+ that earlier-version readers can easily ignore later-version
+ data they are not designed for, or it appears as a minor
+ extension to version N that version N readers are likely to
+ tolerate well.</t>
+
+ <t>The version 1 header is followed by a variable-length
+ <xref target='data'>version 1 data block</xref> containing
four-octet (32-bit) transition times and leap second
occurrences. These 32-bit values are limited to representing
time changes from 1901-12-13 20:45:52 through 2038-01-19 03:14:07 UT,
- and the initial header and data block are present only for backward
+ and the version 1 header and data block are present only for backward
compatibility with obsolescent readers as discussed in
<a xref='issues'>Common Interoperability Issues</a>.</t>
- <t>The TZif header contains a field which specifies the version
- of the file's format. Version 1 files terminate after the
- 32-bit data block.</t>
-
- <t>Version 2 and 3 files extend the format by appending
- a second 44-octet header, another variable-length data block
- using eight-octet (64-bit) transition times and leap second
+ <t>Version 1 files terminate after the version 1 data block.
+ Version 2 and 3 files extend the format by appending a second
+ 44-octet version 2+ header, a variable-length version 2+ data block
+ containing eight-octet (64-bit) transition times and leap second
occurrences, and a variable length
<xref target='footer'>footer</xref>.
These 64-bit values can represent times approximately 292
billion years into the past or future.</t>
+ <t>
+ NOTE: All multi-octet integer values MUST be stored in
+ network octet order format (high-order octet first, otherwise
+ known as big-endian), with all bits significant. Signed
+ integer values MUST be represented using two's complement.
+ </t>
+
<t>A TZif file is structured as follows:</t>
<figure title='General Format of TZif Files'>
<artwork type='inline' align='center'><![CDATA[
Version 1 Versions 2 & 3
+-------------+ +-------------+
- | Header for | | Header for |
- | 32-bit | | 32-bit |
- | Transitions | | Transitions |
+ | Version 1 | | Version 1 |
+ | Header | | Header |
+-------------+ +-------------+
- | Data with | | Data with |
- | 32-bit | | 32-bit |
- | Transitions | | Transitions |
+ | Version 1 | | Version 1 |
+ | Data Block | | Data Block |
+-------------+ +-------------+
- | Header for |
- | 64-bit |
- | Transitions |
+ | Version 2+ |
+ | Header |
+-------------+
- | Data with |
- | 64-bit |
- | Transitions |
+ | Version 2+ |
+ | Data Block |
+-------------+
| Footer |
+-------------+
]]></artwork>
</figure>
- <t>
- NOTE: All multi-octet integer values MUST be stored in
- network octet order format (high-order octet first, otherwise
- known as big-endian), with all bits significant. Signed
- integer values MUST be represented using two's complement.
- </t>
-
<section anchor='header' title='TZif Header'>
- <t>The TZif header is structured as follows (the number
+ <t>A TZif header is structured as follows (the number
of octets occupied by a field is shown in parenthesis):</t>
<figure title='TZif Header'>
<artwork type='inline' align='center'><![CDATA[
- +---------------+---+
- | magic (4) |ver|
- +---------------+---+---------------------------------------+
- | [unused - reserved for future use] (15) |
- +---------------+---------------+---------------+-----------+
- | isutcnt (4) | isstdcnt (4) | leapcnt (4) |
- +---------------+---------------+---------------+
- | timecnt (4) | typecnt (4) | charcnt (4) |
- +---------------+---------------+---------------+
+ +-------------------+----+
+ | magic (4) |v(1)|
+ +-------------------+----+-------------------------------------------------+
+ | [unused - reserved for future use] (15) |
+ +-------------------+-------------------+-------------------+--------------+
+ | isutcnt (4) | isstdcnt (4) | leapcnt (4) |
+ +-------------------+-------------------+-------------------+
+ | timecnt (4) | typecnt (4) | charcnt (4) |
+ +-------------------+-------------------+-------------------+
]]></artwork>
</figure>
@@ -321,20 +328,20 @@
which identifies the file as utilizing the Time Zone
Information Format.
</t>
- <t hangText="ver(sion):">
+ <t hangText="v(ersion):">
An octet identifying the version of the
file's format. The value MUST be one of the following:
<list style='hanging'>
<t hangText='NUL (0x00)'>
- Version 1 - The file contains only the 32-bit
+ Version 1 - The file contains only the version 1
header and data block.
- Version 1 files MUST NOT contain a 64-bit header,
+ Version 1 files MUST NOT contain a version 2+ header,
data block, or footer.
</t>
<t hangText="'2' (0x32)">
- Version 2 - The file MUST contain the 32-bit
- header and data block, a 64-bit header and data
+ Version 2 - The file MUST contain the version 1
+ header and data block, a version 2+ header and data
block, and a footer.
The TZ string in the
<xref target='footer'>footer</xref>, if
@@ -349,8 +356,8 @@
ASCII.
</t>
<t hangText="'3' (0x33)">
- Version 3 - The file MUST contain the 32-bit
- header and data block, a 64-bit header and data
+ Version 3 - The file MUST contain the version 1
+ header and data block, a version 2+ header and data
block, and a footer.
The TZ string in the
<xref target='footer'>footer</xref>, if
@@ -383,10 +390,11 @@
A four-octet unsigned integer specifying the number of
local time type records contained in the data block -
MUST NOT be zero.
- (Although time type 0 is not used in files that have
- nonempty TZ strings but no transitions, it is nevertheless
+ (Although local time type records convey no useful
+ information in files that have nonempty TZ strings but
+ no transitions, at least one such record is nevertheless
required because many TZif readers reject files that
- lack time types.)
+ have zero time types.)
</t>
<t hangText="charcnt:">
A four-octet unsigned integer specifying the total number
@@ -398,26 +406,26 @@
</list>
</t>
- <t>Although the 32- and 64-bit headers have the same format
+ <t>Although the version 1 and 2+ headers have the same format
with the same magic number and version fields, their count
- fields can differ because the 32-bit data can be a subset of
- the 64-bit data.</t>
+ fields can differ because the version 1 data can be a subset of
+ the version 2+ data.</t>
</section> <!-- header -->
<section anchor='data' title='TZif Data Block'>
- <t>The TZif data block consists of seven variable-length
+ <t>A TZif data block consists of seven variable-length
elements, each of which is series of zero or more items. The
number of items in each series is determined by the
corresponding count field in the header. The total length of
each element is calculated by multiplying the number of items
by the size of each item. Therefore, implementations that
- do not wish to parse or use the 32-bit data block can
+ do not wish to parse or use the version 1 data block can
calculate its total length and skip directly to the header of
- the 64-bit data block.</t>
+ the version 2+ data block.</t>
- <t>In the initial data block, time values are 32-bit
- (TIME_SIZE = 4 octets). In the second data block, present
+ <t>In the version 1 data block, time values are 32-bit
+ (TIME_SIZE = 4 octets). In the version 2+ data block, present
only in version 2 and 3 files, time values are 64-bit
(TIME_SIZE = 8 octets).</t>
@@ -477,9 +485,9 @@
Each record has the following format:
<figure>
<artwork type='inline'><![CDATA[
- +---------------+-+-+---+
- | utoff (4) |dst|idx|
- +---------------+---+---+
+ +----------------------------+------+------+
+ | utoff (4) |dst(1)|idx(1)|
+ +----------------------------+------+------+
]]></artwork>
</figure>
@@ -501,17 +509,20 @@
A one-octet value indicating whether local
time should be considered Daylight Savings Time (DST).
The value MUST be 0 or 1.
- A value of one (1) indicates that DST is in effect.
- A value of zero (0) indicates that standard time in
- effect.
+ A value of one (1) indicates this time type is DST.
+ A value of zero (0) indicates that this time type
+ is standard time.
</t>
<t hangText="(desig)idx:">
A one-octet unsigned integer specifying an index into
the series of time zone designation octets,
thereby selecting a particular designation string.
Each index MUST be in the range [0, 'charcnt'-1], and
- MUST index a NUL-terminated (0x00) sequence of
- octets.
+ designates the NUL-terminated string of octets
+ starting at position 'idx' in the time zone
+ designations. (This string MAY be empty.) A NUL
+ octet MUST exist in the time zone designations at or
+ after position 'idx'.
</t>
</list>
</t>
@@ -522,6 +533,9 @@
'charcnt' field in the header.
Note that two designations MAY overlap if one is a
suffix of the other.
+ The character encoding of time zone designation strings
+ is not specified; however, see <xref target='interop'/>
+ of this document.
</t>
<t hangText="leap second records:">
A series of eight- or twelve-octet records specifying
@@ -533,7 +547,7 @@
'leapcnt' field in the header.
Each record has one of the following structures:
<list style='hanging'>
- <t hangText="32-bit Data Block:">
+ <t hangText="Version 1 Data Block:">
<figure>
<artwork type='inline'><![CDATA[
+---------------+---------------+
@@ -542,7 +556,7 @@
]]></artwork>
</figure>
</t>
- <t hangText="64-bit Data Block:">
+ <t hangText="Version 2+ Data Block:">
<figure>
<artwork type='inline'><![CDATA[
+---------------+---------------+---------------+
@@ -649,9 +663,9 @@
<figure title='TZif Footer'>
<artwork type='inline' align='center'><![CDATA[
- +---+--------------------+---+
- | NL| TZ string (0...) |NL |
- +---+--------------------+---+
+ +-----+--------------------+-----+
+ |NL(1)| TZ string (0...) |NL(1)|
+ +-----+--------------------+-----+
]]></artwork>
</figure>
@@ -663,7 +677,7 @@
</t>
<t hangText="TZ string:">
A rule for computing local time changes after the last
- transition time stored in the 64-bit data block.
+ transition time stored in the version 2+ data block.
The string is either empty or uses the expanded format
of the "TZ" environment variable as defined in
<eref
@@ -676,8 +690,8 @@
If the string is empty, the corresponding information is
not available.
If the string is nonempty and one or more transitions
- appear in the 64-bit data, the string MUST be
- consistent with the last 64-bit transition - i.e.,
+ appear in the version 2+ data, the string MUST be
+ consistent with the last version 2+ transition - i.e.,
evaluating the TZ string at the time of the last
transition should yield the same time type as the time
type specified in the last transition.
@@ -751,20 +765,24 @@
SHOULD NOT be generated, as they do not support transition
times after the year 2038.</t>
+ <t>Implementations that only understand Version 1 MUST ignore
+ any data that extends beyond the calculated end of the version
+ 1 data block.</t>
+
<t>Implementations SHOULD generate a version 3 file if
TZ string extensions are necessary to accurately
model transition times.
Otherwise, version 2 files SHOULD be generated.</t>
- <t>The sequence of time changes defined by the 32-bit
+ <t>The sequence of time changes defined by the version 1
header and data block SHOULD be a contiguous subsequence
- of the time changes defined by the 64-bit header and data
+ of the time changes defined by the version 2+ header and data
block, and by the footer.
This guideline helps obsolescent version 1 readers
agree with current readers about timestamps within the
contiguous subsequence. It also lets writers not
- catering to obsolescent readers use a 'timecnt' of zero
- in the 32-bit data to save space.</t>
+ supporting obsolescent readers use a 'timecnt' of zero
+ in the version 1 data block to save space.</t>
<t>Time zone designations SHOULD consist of at least three (3)
and no more than six (6) ASCII characters from the set of
@@ -773,11 +791,11 @@
time zone abbreviations.</t>
<t>When reading a version 2 or 3 file, implementations
- SHOULD ignore the 32-bit header and data block except for
+ SHOULD ignore the version 1 header and data block except for
the purpose of skipping over them.</t>
<t>Implementations SHOULD calculate the total lengths of the
- 32- and 64-bit headers and data blocks and compare them against
+ headers and data blocks and check that they all fit within
the actual file size, as part of a validity check for the file.</t>
<t>When a TZif file is used in a MIME message entity it SHOULD
@@ -786,7 +804,8 @@
<list style='symbols'>
<t><xref target="tzif-leap">"application/tzif-leap"</xref>
to indicate that leap second records are included in the
- TZif data.</t>
+ TZif data as necessary (none are necessary if the file is truncated
+ to a range that precedes the first leap second).</t>
<t><xref target="tzif">"application/tzif"</xref>
to indicate that leap second records are not included in the
@@ -831,17 +850,17 @@
<t>As described in Section 3.9 of <xref target="RFC7808"/>,
a TZDIST service MAY truncate time zone transition data.
A truncated TZif file is valid from its first and up to, but
- not including, its last 64-bit transition time, if present.</t>
+ not including, its last version 2+ transition time, if present.</t>
<t>When truncating the start of a TZif file, the service MUST
- supply in the 64-bit data a first transition time that is
+ supply in the version 2+ data a first transition time that is
the start point of the truncation range.
As with untruncated TZif files, time type 0 indicates local
time immediately before the start point, and the time type of
the first transition indicates local time thereafter.</t>
<t>When truncating the end of a TZif file, the service MUST
- supply in the 64-bit data a last transition time that is
+ supply in the version 2+ data a last transition time that is
the end point of the truncation range, and MUST supply an
empty TZ string.
As with untruncated TZif files with empty TZ strings, a
@@ -1120,7 +1139,9 @@
<t>This section documents common problems in implementing this
specification.
Most of these are problems in generating TZif files for use by
- readers conforming to predecessors of this specification.
+ readers conforming to
+ <eref target='https://github.com/eggert/tz/commits/master/tzfile.5'>
+ predecessors of this specification</eref>.
The goals of this section are:
<list style='numbers'>
@@ -1149,11 +1170,11 @@
<t>Interoperability problems with TZif include the following:
<list style='symbols'>
- <t>Some readers examine only 32-bit data.
- As a partial workaround, a writer can output as much 32-bit
+ <t>Some readers examine only version 1 data.
+ As a partial workaround, a writer can output as much version 1
data as possible.
- However, a reader should ignore 32-bit data, and should use
- 64-bit data even if the reader's native timestamps have only
+ However, a reader should ignore version 1 data, and should use
+ version 2+ data even if the reader's native timestamps have only
32 bits.</t>
<t>Some readers designed for version 2 might mishandle