Re: Differences between man-pages and libc manual safety markings

Torvald Riegel <triegel@xxxxxxxxxx> · Sat, 01 Nov 2014 11:47:47 +0100

On Sat, 2014-11-01 at 06:48 -0200, Alexandre Oliva wrote:
> On Oct 30, 2014, Torvald Riegel <triegel@xxxxxxxxxx> wrote:
> 
> > On Thu, 2014-10-30 at 16:24 -0200, Alexandre Oliva wrote:
> >> >> > The hardware requires synchronizing accesses, and just the mere presence
> >> >> > of a data race may lead to undefined behavior of the program.
> >> 
> >> Sorry, but “undefined behavior” is standardese for “don't do that”.
> 
> > It's don't do that for a reason, not just don't do that and you'll be
> > fine.
> 
> Yup.  But remember, it's users of the standard we implement that are not
> supposed to do that.  We can and often do get away with such stuff as
> part of the implementation of the standard.  There's a long history of
> doing so: remember when we implemented mutexes without standard atomics?
> Nowadays, you might look at them and find those implementations
> disgusting, and think “how the heck nobody thought of documenting the
> reliance of this code on certain memory model properties that no longer
> hold?”  The obvious answer is that, back then, such properties were
> perfectly normal and they had no idea something else might take over in
> the distant future.  The point I'm trying to make is that there's only
> so much future-proofness you can put into this sort of documentation.

I don't disagree with this in general.  But in the concrete case we're
talking about, it's really not that hard.  Does strcpy need to consider
that there are concurrent accesses and that it has to achieve
something under concurrent execution, or does it not?

It's not surprising that this matters today (i.e., when you made the
choices), and it's not as if we became aware of this only yesterday.

> The most difficult bits to document are not those that are surprising as
> of the writing of the docs, but those that are blatantly obvious to
> pretty much anyone at that time, but that over time become surprising.
> Historians run into lots of walls related with this sort of implicit
> knowledge of a time.

That's why I'm arguing for being conservative: Be a little cautious with
what you consider obvious.  I definitely agree that one can't be perfect
with that but, for example, there is a clear difference between
depending on an additional implementation property and just relying on
the sequential contract of the function.

> > Please try to understand the issue.
> 
> I do.  It's very clear to me.  You wanted and hoped me to do a lot of
> work that I was not supposed to do, and I didn't do it, in part because
> I have little hope of seeing the future as well as you claim to be able
> to.

I didn't ask you to know everything that will happen in the future,
because that would be hard, as you say.  But the consequence is that
it helps to be cautious when making assumptions about things that may
easily change in the future and that you can't predict.

IOW, when you can't easily predict future implementations, be
conservative when making assumptions about them.  Or at least document
that.

> > How is that supposed to work if you haven't documented all the
> > assumptions you made (i.e., if ctermid is not just an outlier)?
> 
> It is, but if I were to document all the assumptions I made, I'd have to
> write several books of assumptions, encoding all the knowledge I've
> accumulated about how past and present hardware architectures work, any
> one of which might change in future architectures.

I don't think it's that hard.  Coming back to the "being conservative"
point, if you feel like you have to write a book about the assumptions
you make that your code (or your documentation, annotations, ...) relies
on, then maybe it's better to take a step back and not make those
assumptions in the first place.

In our case here, if you feel like what you require from the strcpy
implementation is very complex, perhaps just don't make the requirement
and tag ctermid as unsafe?

Or, don't go for specifying assumptions about strcpy in the ctermid
docs, but rather try to solve it at the other end by documenting that
strcpy has to work well under concurrent execution, in particular under
concurrent but "idempotent" copies to a memory range.
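
To make the pattern concrete, here is a minimal sketch of the kind of
code this is about.  It is not glibc's ctermid; my_ctermid and
shared_name are hypothetical names, and the buffer size is an
assumption.  When called with NULL from several threads, every caller
stores the same bytes into the same static buffer, which is the
"idempotent" concurrent copy that strcpy's sequential contract does not
cover.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a ctermid-like function; NOT glibc's code. */
static char shared_name[16];   /* assumed large enough for "/dev/tty" */

char *my_ctermid(char *s)
{
    static const char name[] = "/dev/tty";

    if (s == NULL)
        s = shared_name;       /* all NULL callers share this buffer */

    /* Concurrent NULL callers all write identical bytes here.  Whether
       that is acceptable depends on a property of strcpy that goes
       beyond its sequential contract. */
    return strcpy(s, name);
}
```

A single-threaded caller sees nothing unusual; the question only arises
when two threads run my_ctermid(NULL) at the same time.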

> > what assumptions you made beyond the contracts of functions
> 
> Heh.  It's almost funny that you talk about the contracts of functions,
> when you yourself claimed their definitions were not clear enough to
> figure out what the precise requirements were.  Please make up your mind
> about that point before wasting more my time, will you?

I never said that the sequential contract of strcpy was incomplete
or wrong in some way.  I said that the MT-Safety definition needs
improvement.
When you make the assumption that it has to work even under concurrent
accesses to the destination string, you go beyond the sequential
contract of the function.  You specified it as MT-Safe, but in your
specification that means that the caller-provided data is supposed to be
protected from concurrent accesses by the caller.  So, your assumption
still conflicts with the contract when taking MT-Safe docs into account.
Thus, what I said was not inconsistent.
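
For comparison, this is roughly what "the caller protects the
caller-provided data" looks like in practice.  It is only a sketch
under assumptions: buf_lock, shared_buf, set_shared and get_shared are
hypothetical names, and a plain pthread mutex is just one way for the
caller to provide the protection.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical caller-side state; names are illustrative only. */
static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;
static char shared_buf[32];

/* Copy src into the shared buffer while holding the lock, so strcpy
   only ever runs under its sequential contract.  Returns 0 on success,
   -1 if src does not fit. */
int set_shared(const char *src)
{
    int rc = -1;

    pthread_mutex_lock(&buf_lock);
    if (strlen(src) < sizeof shared_buf) {
        strcpy(shared_buf, src);
        rc = 0;
    }
    pthread_mutex_unlock(&buf_lock);
    return rc;
}

/* Copy the shared buffer into out (n must be at least 1). */
void get_shared(char *out, size_t n)
{
    pthread_mutex_lock(&buf_lock);
    strncpy(out, shared_buf, n - 1);
    out[n - 1] = '\0';
    pthread_mutex_unlock(&buf_lock);
}
```

Here no assumption about strcpy's behavior under concurrent accesses is
needed, because no two accesses to shared_buf can overlap.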
