Re: [PATCH] trace2: increment event format version

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 01 2021, Jeff Hostetler wrote:

> On 11/12/21 5:33 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Nov 11 2021, Junio C Hamano wrote:
>> 
>>> Josh Steadmon <steadmon@xxxxxxxxxx> writes:
>>>
>>>> On 2021.11.11 15:03, Junio C Hamano wrote:
>>>>> Josh Steadmon <steadmon@xxxxxxxxxx> writes:
>>>>>
>>>>>> In 64bc752 (trace2: add trace2_child_ready() to report on background
>>>>>> children, 2021-09-20), we added a new "child_ready" event. In
>>>>>> Documentation/technical/api-trace2.txt, we promise that adding a new
>>>>>> event type will result in incrementing the trace2 event format version
>>>>>> number, but this was not done. Correct this in code & docs.
>>>>>>
> ...
>> On the field itself I also wonder if it's useful at all. I'd think
>> anyone implementing a parser for the format would dispatch to a lookup
>> handling known keys, so having a version indicating "new keys here"
>> seems rather useless.
>> 
>
> That may be true, but it is easier to have a version number and
> allow parsers to ignore it, than it is to not have a version number
> and at some point in the future require parsers try to figure it
> out. IMHO.
>
> So far we've only added new event types (cmd_ancestry and child_ready)
> and everything is in regular JSON forms, so parsing and dispatching
> is relatively easy and the version number is not really needed.
> But, if in the future we decide to change the contents within one of
> those events, then the version number may be significant.

Yes, exactly. Versioning it makes sense, I'm commenting specifically on
the criteria for bumping the version. I.e. this bit in
trace2/tr2_tgt_event.c (with my comments)::
    
     * The version should be incremented if new event types are added

That seems entirely useless, i.e. just ignore unknown types. If you get
an unknown type you know something is new, no need for a version number
bump for this case.

It adds no new information you're not getting with "oh? a new field?".

     * if existing fields are removed, [...]

Useful, you don't always know if the absence of a field means it won't
be there at all, or if something else is up, particularly if the field
is optional. E.g. if we rename "error" to "exception" (or whatever) we
don't want an older consumer to report that there's no errors happening.

     * [...]or if there are significant changes in
     * interpretation of existing events or fields.

Carving out "significant changes" seems rather odd.

Let's just say "any changes"? E.g. I don't think my 2d3491b117c (tr2:
log N parent process names on Linux, 2021-08-27) qualifies as
significant change, but why not bump the version even for that?

In that case it was part of a release where we bumped the version
anyway, but let's come up with some sane criteria.

What's "significant" is also entirely subjective. I think 2d3491b117c is
trivial in the context of the entirety of our trace2 data. But if all
you care about is logging process trees it's highly significant.

     * Smaller changes, such as adding
     * a new field to an existing event, do not require an increment to the EVENT
     * format version.

I'd think a new sub-field would be in the same class as a new top-level
event field, i.e. adding a version wouldn't add any information. It
falls under the same general rule that you can traverse future data with
your "version" in hand, and a whitelist of fields you know how to
handle. If the version is the same but you've got new unknown fields you
shouldn't need to care.

IOW I think this would make more sense as a version bumping criteria:

    The version should be incremented whenever an existing consumer of
    trace2 data might want to act differently based on the new data.

    An exception to this is that any new event types do not merit
    bumping the version number. E.g. we have a top-level event type
    "error" now, but might hypothetically add a new "warning" type.
    Such an addition won't require bumping the version.

    Likewise adding new mandatory fields to existing events doesn't
    require bumping the version. E.g. the "error" type has (as of
    writing) a "fmt" and "msg" field. Let's say a future version adds an
    "id" (as in unique id for the error) field, such an addition won't
    require bumping the version.

    In other words, consumers of the trace2 JSON format are expected to
    walk the structure and only pick those things that they know about.
    Any unknown fields the consumer doesn't know about can be safely
    discarded. This won't apply if the version is bumped, then all bets
    are off, and the meaning of existing fields may or may not have
    changed.

    The idea is to encourage additive changes over changes to existing
    fields, and to reduce the work in maintaining the consumers of the
    format.

    As long as consumers ignore new unknown data they won't need to
    be updated every time the format changes in any way, only for
    potentially backwards-incompatible changes.

Wouldn't this be a saner policy for version bumping? AFAICT the only
thing you wouldn't be getting that you're getting now is the trivial
optimization of being able to say cheaply route trace2 payloads based on
version number alone (but even that is iffy now due to the subjectivity
of "significant change").




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux