Hi Jeremy,
On 11/12/21 02:12, Jeremy Kerr wrote:
Hi Alex,
Thanks for the review! I've updated in line with most of your comments,
and will send a v3 soon. However, I do have a couple of queries, mainly
for my own understanding:
Sure.
+.SH SYNOPSIS
+.nf
+.B #include <sys/socket.h>
+.B #include <linux/mctp.h>
I prefer alphabetic sorting of includes.
OK, does that take priority over the convention to list the header(s)
specific to this man page last?
I didn't even know there was such a convention, but if there is, yes, I
explicitly want to override it.
Rationales:
<https://google.github.io/styleguide/cppguide.html#Include_What_You_Use>
<https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes>
<https://stackoverflow.com/a/2763292/6872717>
<https://stackoverflow.com/a/2762626/6872717>
Basically, alphabetic order avoids undeclared dependencies that can be
hidden in some specific orderings which "just work but if you reorder
them it breaks". Of course, it can still hide some dependency, but it's
more unlikely. It's a "this can be random order, but let's use
something more readable and consistent than random".
Unless for some exception that I can't remember now, Google's style
guide applies to includes in the man-pages (or I intend it to be so).
In that case, we end up with:
#include <linux/mctp.h>
#include <sys/socket.h> /* Definition of socket() & SOCK_DGRAM */
Since <sys/socket.h> provides the prototype socket(), it's
<linux/mctp.h> that should specify why it's needed, so it should be
#include <linux/mctp.h> /* Definition of AF_MCTP */
#include <sys/socket.h>
Also, please use at least 2 spaces between code and comments (unless the
line goes close to the 78 column (right margin), in which case, reducing
that space is the most readable thing to do).
Rationale:
<https://google.github.io/styleguide/cppguide.html#Horizontal_Whitespace>
That helps visually separate code from non-code.
+.PP
+The API for MCTP messaging uses a standard sockets interface, using the
+.BR sendto (2)
+and
+.BR recvfrom (2)
+classes of system calls to transfer messages.
+Messages may be fragmented into packets before transmission, and reassembled at
+the remote endpoint.
Break at the comma.
Just this comma? or all of them? There's a couple of sentences right
before this one that would seem to have a similar style - if it's the
former, for my own learning here: what makes this one different?
There are a few more that I missed, that's right:
[
> +This is a connectionless protocol, typically used between devices
within a
> +server system.
> +Message reliability and ordering are not guaranteed, but message
boundaries are
> +preserved.
]
Those should also be broken at the comma. Rationale: semantic newlines
(man-pages(7)).
In the case of the following one, although you could break at it if you
want (maybe better for consistency), I won't enforce it too much, since
it is a couple of words and the line is already broken in a non-semantic
way due to the formatting. So I don't care too much:
[
> +The API for MCTP messaging uses a standard sockets interface, using the
> +.BR sendto (2)
]
As Branden said, you can use "/[;:,]." and "/[!?.][^\\]" to check the
correct use of semantic newlines.
[and you mean a line-break, right? or a break-point escape sequence?]
Yes, line break.
+Packets between a local and remote endpoint are identified by the source
Break after "by" (or perhaps just before it).
Same as the above, why is this?
This is more or less for the same reasons as above, semantic newlines,
but it goes a bit deeper. Branden and I discussed a few months ago
about my strong preference for semantic newlines not only in clause
boundaries but also phrase boundaries.
man-pages(7) recommends breaking long lines at clause boundaries
(commas, semicolons, and so on), but somethimes clauses (if you don't
know the difference between phrases and clauses, which you don't need
to, basically clauses are made up of phrases) are too long, and you can
have a single clause that uses more than a single line. In those cases,
the most sensible place to break the line is at the next level: phrase
boundaries.
"the source and destination EIDs" is a single phrase, so it's a bit
weird to break the line in the middle of it. I avoid breaking phrases,
which makes reading the source code a bit more difficult. Hopefully, it
will also make diffs easier to read in the future.
+struct sockaddr_mctp {
+ unsigned short smctp_family; /* = AF_MCTP */
+ uint16_t __smctp_pad0; /* pad, must be zero */
+ int smctp_network; /* local network identifier */
+ struct mctp_addr smctp_addr; /* EID */
+ uint8_t smctp_type; /* message type byte */
+ uint8_t smctp_tag; /* tag value, including TO flag */
+ uint8_t __smctp_pad1; /* pad, must be zero */
Do we want to tie the implementation to this pad?
Yes. The pad will be there anyway, due to the natural alignment of the
struct. Since we want to be explicit about the padding (and require it
to be zeroed), I would strongly suggest keeping it documented.
If there was padding in the middle of the struct, yes, it should
definitely be documented in the man page.
There is an 'extended' MCTP addressing struct, which encapsulates a
struct sockaddr_mctp. For us to be absolutely clear about the layout of
that structure, the explicit pad here makes that unambiguous.
What I mean is, if in the future this structure will have additional
trailing fields, documenting this padding is unnecessary, since that may
vary. Code should not rely on this structure having _only_ that
padding. And if code handles any arbitrary extra stuff in this
structure, it will implicitly also handle that __smctp_pad1 field, so
there's no need to mention it.
Example:
struct sockaddr_mctp {
unsigned short smctp_family; /* = AF_MCTP */
uint16_t __smctp_pad0; /* pad, must be zero */
int smctp_network; /* local network identifier */
struct mctp_addr smctp_addr; /* EID */
uint8_t smctp_type; /* message type byte */
uint8_t smctp_tag; /* tag value, including TO flag */
uint8_t foo; /* was __smctp_pad1 */
uint8_t bar; /* extra stuff */
};
Here I got rid of the pad, and even added an extra field. Code should
be written to be compatible with this case, right? If so, I don't see
any reason to document that padding field, IMHO.
Also, we prevent some crazy programmers from relying on that padding
byte being actually padding and not something else, even if it "must" be
zero. I've seen too much crazy stuff; programmers relying on undefined
behavior just because "we don't plan to move from C++17 to C++20, so
this is safe".
[unless, for man pages, we don't care about the ABI, only the API?]
We care about the ABI. Especially, if it's about a type that we control.
In the case of ISO C or POSIX types, system_data_types(7) doesn't
document Linux-specific details, for portability reasons, but that's an
exception, not the rule.
Future implementations of sockaddr_mctp are not going to use that
byte for anything else?
They might, hence requiring zero at present.
Okay, then code should be able to handle _any_ trailing fields,
including that padding, so documenting it is irrelevant, I think. We
could say something like "trailing fields may be added to this
structure", and that already implies the current padding byte, doesn't it?
Cheers,
Alex
--
Alejandro Colomar
Linux man-pages comaintainer; http://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/