Re: [PATCH 1/1] rseq.2: New man page for the rseq(2) API

Alejandro Colomar <alx.manpages@xxxxxxxxx> · Wed, 15 Feb 2023 18:16:49 +0100

Hi Mathieu,

On 2/15/23 18:09, Mathieu Desnoyers wrote:
> On 2023-02-14 20:20, G. Branden Robinson wrote:
>> [CC list violently trimmed; for those who remain, this is mostly man
>> page style issues]
> 
> [ Gently added linux-man to CC list. ;-) ]

:-)

> 
>>
>> At 2023-02-14T23:29:37+0100, Alejandro Colomar wrote:
>>> On 2/14/23 20:54, Mathieu Desnoyers wrote:
>>>> +per-thread data structure shared between kernel and user-space.
>>>
>>> This last 'user-space' is not adjectivated, so it should go without
>>> a hyphen, according to common English rules.
>>
>> +1
> 
> done
> 
> 
>>
>> Also I like your coinage.  "Adjectivated yeast" is reflexive and
>> tautological!
>>
>>>> +.RB ( "struct rseq" )
>>>
>>> We format types in italics, so this should be '.RI'.
>>
>> +1
> 
> OK, so it's italics for both types and arguments.
> 
> I will replace all the bold markers for "struct rseq" and "struct 
> rseq_cs" to italic in the description (but not in the synopsis section 
> and not in the code snippets).
> 
>>
>>>> +Only one
>>>> +.BR rseq ()
>>>> +ABI can be registered per thread, so user-space libraries and
>>>> +applications must follow a user-space ABI defining how to share this
>>>> +resource.
>>>
>>> Please use semantic newlines.  See man-pages(7):
>>>
>>>     Use semantic newlines
>>>         In  the source of a manual page, new sentences should be started on new
>>>         lines, long sentences should be split into lines at clause breaks (com‐
>>>         mas, semicolons, colons, and so on), and long clauses should  be  split
>>>         at  phrase  boundaries.   This convention, sometimes known as "semantic
>>>         newlines", makes it easier to see the effect of  patches,  which  often
>>>         operate at the level of individual sentences, clauses, or phrases.
>>
>> I think I've said this before, but, strictly, commas in particular can
>> separate things that are not clauses.  Clauses have subjects and
>> predicates.
>>
>> Might it be better to say simply:
>>
>>    Start each sentence on a new line.  Split long sentences where
>>    punctuated by commas, semicolons, and colons.
>>
>> With this there is not even any need to discuss "phrase boundaries".
>>
> 
> I've modified to:
> 
> Only one
> .BR rseq ()
> ABI can be registered per thread,
> so user-space libraries and applications must follow a user-space ABI
> defining how to share this resource.
> 
> Hopefully that's correct.

Yes, that's good.

> 
> 
>>> In the above lines, that would mean breaking after the comma,
>>> and not leaving resource in a line of its own.
>>
>> The latter is inevitably going to happen from time to time simply due to
>> sentence length and structure and the line length used by one's text
>> editor.  I don't think an "orphan word" (what typographers call this) is
>> symptomatic of anything in *roff source when filling is enabled.
>>
>>>> +The ABI defining how to share this resource between applications and
>>>> +libraries is defined by the C library.
>>>> +Allocation of the per-thread
>>>> +.BR rseq ()
>>>> +ABI and its registration to the kernel is handled by glibc since version
>>>> +2.35.
>>>> +.PP
>>>> +The
>>>> +.BR rseq ()
>>>> +ABI per-thread data structure contains a
>>>> +.I rseq_cs
>>>> +field which points to the currently executing critical section.
>>>
>>> currently-executing should probably use a hyphen
>>> (if I understood the line correctly).
>>
>> This is not the case, according to some style authorities.  Dave Kemper
>> convinced me of this on the groff list.
>>
>> Here is one resource.
>>
>> https://www.editorgroup.com/blog/to-hyphenate-or-not-to-hyphenate/
>>
>>> See an interesting discussion in the groff@ mailing list:
>>> <https://lists.gnu.org/archive/html/groff/2022-10/msg00015.html>
>>
>> That's not _squarely_ on point, as none of "block", "device", or "based"
>> is an adverb.  "Currently" is.
> 
> Leaving unchanged based on this discussion.
> 
>>
>>>> +For each thread, a single rseq critical section can run at any given
>>>> +point.
>>>> +Each critical section need to be implemented in assembly.
>>>
>>> needs?
>>
>> +1
> 
> done.
> 
>>
>>>> +.TP
>>>> +.B Structure alignment
>>>
>>> Let's remove the bold here.  It's not necessary for marking a constant
>>> or something that needs bold.  And the indentation is already making
>>> it stand out, so bold is a bit too much aggressive to the reader.
>>
>> I agree; if it wouldn't be styled in running text, it doesn't need
>> styling as a paragraph tag; it already stands out by dint of its
>> placement as a tag.
>>
>>>> +Its value should always be confirmed by reading the cpu_id field before
>>>
>>> cpu_id should be formatted (.I).
>>
>> +1
> 
> done
> 
>>
>>>> +user-space performs any side-effect
>>>> +(e.g. storing to memory).
>>>> +.IP
>>>> +This field is always guaranteed to hold a valid CPU number in the range
>>>> +[ 0 ..  nr_possible_cpus - 1 ].
>>>
>>> Please use interval notation:
>>> 	[0, nr_possible_cpus)
>>> or
>>> 	[0, nr_possible_cpus - 1]
>>> whichever looks better to you.
>>>
>>> We did some consistency fix recently:
>>> <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=147a60d792a5db8f3cb93ea16eefb73e16c1fb91>
>>>
>>> Also, do we have a more standard way of saying nr_possible_cpus?
>>> Should we say nproc?
> 
> nproc(1) means:
> 
>         Print  the number of processing units available to the current
>         process, which may be less than the number of online processors
> 
> Which is the number of cpus currently available (AFAIU the result of the
> cpuset and sched affinity).
> 
> What I really mean here is the maximum value for possible cpus which can 
> be hotplugged into the system. So it's not the maximum number of 
> possible CPUs per se, but rather the maximum enabled bit in the possible 
> CPUs mask.
> 
> Note that we could express this differently as well: rather than saying 
> that it guarantees a value in the range [0, nr_possible_cpus - 1], we 
> could say that the values are guaranteed to be part of the possible cpus 
> mask, which would actually more accurate in case the possible cpus mask 
> has a hole (it tends to happen with things like lxc containers nowadays).
> 
> Do you agree that we should favor expressing this in terms of belonging 
> to the possible cpumask set rather than a range starting from 0 ?

On 2/15/23 18:12, Mathieu Desnoyers wrote:
> Actually, the field may contain the value 0 even if 0 is not part of the 
> possible cpumask. So forget what I just said about being guaranteed to 
> be part of the possible cpus mask.
> 
> Thoughts ?
> 
> Thanks,
> 
> Mathieu

I don't have a full understanding, so I will trust you for deciding what is
best.  I'll try to understand it, but my kernel knowledge is rather limited :)

I suggest writing a detailed description, instead of (or complementary to it)
just using a range, since readers might wonder as I did, what nr_possible_cpus
is (it's not really described anywhere so far).  With a worded description,
we can later improve it if we find it not clear enough, but should be enough
for an initial page.

Thanks!

Alex

> 
> Thanks,
> 
> Mathieu
> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
Attachment:
OpenPGP_signature

Description: OpenPGP digital signature