Re: [PATCH v2] mctp.7: Add man page for Linux MCTP support

"Alejandro Colomar (man-pages)" <alx.manpages@xxxxxxxxx> · Fri, 12 Nov 2021 19:45:16 +0100

Hi Jeremy,

On 11/12/21 02:12, Jeremy Kerr wrote:
Hi Alex,

Thanks for the review! I've updated in line with most of your comments,
and will send a v3 soon. However, I do have a couple of queries, mainly
for my own understanding:

Sure.

+.SH SYNOPSIS
+.nf
+.B #include <sys/socket.h>
+.B #include <linux/mctp.h>

I prefer alphabetic sorting of includes.

OK, does that take priority over the convention to list the header(s)
specific to this man page last?

I didn't even know there was such a convention, but if there is, yes, I 
explicitly want to override it.

Rationales:

<https://google.github.io/styleguide/cppguide.html#Include_What_You_Use>
<https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes>
<https://stackoverflow.com/a/2763292/6872717>
<https://stackoverflow.com/a/2762626/6872717>

Basically, alphabetic order avoids undeclared dependencies that can be 
hidden in some specific orderings which "just work but if you reorder 
them it breaks".  Of course, it can still hide some dependency, but it's 
more unlikely.  It's a "this can be random order, but let's use 
something more readable and consistent than random".

Unless for some exception that I can't remember now, Google's style 
guide applies to includes in the man-pages (or I intend it to be so).

In that case, we end up with:

     #include <linux/mctp.h>
     #include <sys/socket.h> /* Definition of socket() & SOCK_DGRAM */

Since <sys/socket.h> provides the prototype socket(), it's 
<linux/mctp.h> that should specify why it's needed, so it should be

      #include <linux/mctp.h>  /* Definition of AF_MCTP */
      #include <sys/socket.h>

Also, please use at least 2 spaces between code and comments (unless the 
line goes close to the 78 column (right margin), in which case, reducing 
that space is the most readable thing to do).

Rationale:

<https://google.github.io/styleguide/cppguide.html#Horizontal_Whitespace>

That helps visually separate code from non-code.

+.PP
+The API for MCTP messaging uses a standard sockets interface, using the
+.BR sendto (2)
+and
+.BR recvfrom (2)
+classes of system calls to transfer messages.
+Messages may be fragmented into packets before transmission, and reassembled at
+the remote endpoint.

Break at the comma.

Just this comma? or all of them? There's a couple of sentences right
before this one that would seem to have a similar style - if it's the
former, for my own learning here: what makes this one different?

There are a few more that I missed, that's right:

[
> +This is a connectionless protocol, typically used between devices 
within a
> +server system.
> +Message reliability and ordering are not guaranteed, but message 
boundaries are
> +preserved.
]

Those should also be broken at the comma.  Rationale: semantic newlines 
(man-pages(7)).

In the case of the following one, although you could break at it if you 
want (maybe better for consistency), I won't enforce it too much, since 
it is a couple of words and the line is already broken in a non-semantic 
way due to the formatting.  So I don't care too much:

[
> +The API for MCTP messaging uses a standard sockets interface, using the
> +.BR sendto (2)
]

As Branden said, you can use "/[;:,]." and "/[!?.][^\\]" to check the 
correct use of semantic newlines.

[and you mean a line-break, right? or a break-point escape sequence?]

Yes, line break.

+Packets between a local and remote endpoint are identified by the source

Break after "by" (or perhaps just before it).

Same as the above, why is this?

This is more or less for the same reasons as above, semantic newlines, 
but it goes a bit deeper.  Branden and I discussed a few months ago 
about my strong preference for semantic newlines not only in clause 
boundaries but also phrase boundaries.

man-pages(7) recommends breaking long lines at clause boundaries 
(commas, semicolons, and so on), but somethimes clauses (if you don't 
know the difference between phrases and clauses, which you don't need 
to, basically clauses are made up of phrases) are too long, and you can 
have a single clause that uses more than a single line.  In those cases, 
the most sensible place to break the line is at the next level: phrase 
boundaries.

"the source and destination EIDs" is a single phrase, so it's a bit 
weird to break the line in the middle of it.  I avoid breaking phrases, 
which makes reading the source code a bit more difficult.  Hopefully, it 
will also make diffs easier to read in the future.

+struct sockaddr_mctp {
+    unsigned short     smctp_family;  /* = AF_MCTP */
+    uint16_t           __smctp_pad0;  /* pad, must be zero */
+    int                smctp_network; /* local network identifier */
+    struct mctp_addr   smctp_addr;    /* EID */
+    uint8_t            smctp_type;    /* message type byte */
+    uint8_t            smctp_tag;     /* tag value, including TO flag */
+    uint8_t            __smctp_pad1;  /* pad, must be zero */

Do we want to tie the implementation to this pad?

Yes. The pad will be there anyway, due to the natural alignment of the
struct. Since we want to be explicit about the padding (and require it
to be zeroed), I would strongly suggest keeping it documented.

If there was padding in the middle of the struct, yes, it should 
definitely be documented in the man page.

There is an 'extended' MCTP addressing struct, which encapsulates a
struct sockaddr_mctp. For us to be absolutely clear about the layout of
that structure, the explicit pad here makes that unambiguous.

What I mean is, if in the future this structure will have additional 
trailing fields, documenting this padding is unnecessary, since that may 
vary.  Code should not rely on this structure having _only_ that 
padding.  And if code handles any arbitrary extra stuff in this 
structure, it will implicitly also handle that __smctp_pad1 field, so 
there's no need to mention it.

Example:

struct sockaddr_mctp {
    unsigned short     smctp_family;  /* = AF_MCTP */
    uint16_t           __smctp_pad0;  /* pad, must be zero */
    int                smctp_network; /* local network identifier */
    struct mctp_addr   smctp_addr;    /* EID */
    uint8_t            smctp_type;    /* message type byte */
    uint8_t            smctp_tag;     /* tag value, including TO flag */
    uint8_t            foo;           /* was __smctp_pad1 */
    uint8_t            bar;           /* extra stuff */
};

Here I got rid of the pad, and even added an extra field.  Code should 
be written to be compatible with this case, right?  If so, I don't see 
any reason to document that padding field, IMHO.

Also, we prevent some crazy programmers from relying on that padding 
byte being actually padding and not something else, even if it "must" be 
zero.  I've seen too much crazy stuff; programmers relying on undefined 
behavior just because "we don't plan to move from C++17 to C++20, so 
this is safe".

[unless, for man pages, we don't care about the ABI, only the API?]

We care about the ABI.  Especially, if it's about a type that we control.

In the case of ISO C or POSIX types, system_data_types(7) doesn't 
document Linux-specific details, for portability reasons, but that's an 
exception, not the rule.

Future implementations of sockaddr_mctp are not going to use that
byte for anything else?

They might, hence requiring zero at present.

Okay, then code should be able to handle _any_ trailing fields, 
including that padding, so documenting it is irrelevant, I think.  We 
could say something like "trailing fields may be added to this 
structure", and that already implies the current padding byte, doesn't it?

Cheers,
Alex

--
Alejandro Colomar
Linux man-pages comaintainer; http://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/