Re: Differences between man-pages and libc manual safety markings

Torvald Riegel <triegel@xxxxxxxxxx> · Sat, 01 Nov 2014 20:54:51 +0100

On Sat, 2014-11-01 at 16:22 -0200, Alexandre Oliva wrote:
> On Nov  1, 2014, Torvald Riegel <triegel@xxxxxxxxxx> wrote:
> 
> > On Sat, 2014-11-01 at 06:24 -0200, Alexandre Oliva wrote:
> >> strcpy as it is implemented today makes ctermid MT-Safe.  That's what
> >> the current documentation states, and AFAICT that is correct.  Do you
> >> disagree?  If so, please point out in current code where it deviates.
> 
> > The fault is in the difference between what strcpy is specified to do
> 
> Please point out the part of strcpy's contract that states it can write
> garbage to the destination before writing what it ought to, and then you
> (and Mark) might have a point.

It's a sequential specification, so before/after, the intermediate steps
are unspecified, so it is allowed to do what it wants there.  Within the
rules of C -- so as-if applies here.  And that doesn't disallow strcpy
to prevent writing intermediate states (whether you consider them
garbage or not).

Would you make any assumptions about the stores performed by a sorting
function that is specified to take an array, and return with the array's
elements being sorted?  Would you require it to only write finally
sorted data out, or would you allow it to use the array as scratch space
too?  I guess the latter.  And the same applies to strcpy, you don't
want to restrict whether it copies forwards or backwards, for example.
And you don't have to for sequential code.

> Don't even bother arguing a function might do things not explicitly
> written out in their specs, or I'll suggest functions might release
> locks and take them back at will, just to put things in a familiar
> context in which you'd see how bad that would be.

First, now you're talking about concurrent code, not sequential code.

If we assume we have concurrent code, then you could even release and
acquire locks at will if you can prove that this is still okay under
as-if.  But that's besides the point we're discussing here because...

.. strcpy is not specified as a concurrent function, is it?  And it
doesn't operate on volatile data.  So, yes, it can write other stuff to
the strings it will overwrite eventually with the final results.  Same
as a sorting function might do.  The requirement is about the state
after the function has completed, not while it does something.  That's
why these things are called post-conditions.

> 
> You may then resort to the as-if rule combined with the requirement for
> synchronization points between reads and writes by different threads,
> and that would be correct for *current* standards.  Not so for much
> older standards, that had no such requirements,

If you're referring to pre-C11 C standards, then they had a
single-threaded abstract machine, didn't they?

> and that existing
> programs still target, so we should ideally provide an implementation
> suitable for them.
> 
> 
> So, who changed the contract and put in additional requirements, again?
> 
> > See Mark's comments.  Are you saying that the implementation possibility
> > he mentions is obviously wrong?
> 
> It depends on the targeted standard.  For recent standards, it's a
> legitimate optimization.  For older standards, it would not be.

Unless you say which "older standards" you actually refer to, I can't
interpret your statement.  Because older standards for single-threaded
abstract machines do allow this.

> For
> future ones, I seem to have misplaced my crystal ball .
> 
> > @c This function is a stub by default; the actual implementation, for
> > @c posix systems, returns an internal buffer if passed a NULL string,
> > @c but the internal buffer is always set to /dev/tty.
> 
> > Where does it talk about a potential data race?  It says that's there a
> > shared buffer, but it does not mention that there's something special
> > about how the buffer is initialized.  Nor that this affects strcpy
> > implementations in any way.
> 
> See?, that's what I said about the notes not being in a form you wanted.
> The information is all there: there's a static shared buffer (as
> required by ctermid(NULL), and that's usually not thread-safe, but
> that's tacit knowledge of our time not explicitly duplicated there),

Wow, really?  It supposed to be "tacit knowledge of our time" to not
have thread-safe initialization of a static buffer?  Or have a
pre-initialized static buffer?  "but the internal buffer is always set
to /dev/tty" makes it obvious that there's a data race?  Really?

Maybe we should make a poll on libc-alpha to see which percentage of
people actually understands this comment as implying that there is a
data race.

> and
> it's always set to /dev/tty (and the fact that it uses strcpy is just an
> implementation detail that, after optimization, is not even true any
> more).

The validity of this comment relies on the "implementation detail" and
on other implementation details such as what the compiler does.  So that
doesn't look like "just an implementation detail" to me...

> > And how do you expect strcpy implementers to be aware of this, in
> > practice, with high probability?
> 
> I don't.  You wish they would, and I understand why, but that's all.
> You wish I had solved a documentation problem, but I was under no
> obligation to do so and I had enough on my plate already so I did not
> jump through additional hoops just to satisfy your wishes.

I'm not really sure what to say after reading that.

Also note that "your wishes" isn't what this is about -- it is about
maintainable documentation.  This affects glibc in general.

> > Should they review all the callers and look for undocumented
> > assumptions?
> 
> I'm afraid that's how things have usually been done.
> 
> > Do you think that this scales well and keeps complexity down?
> 
> No.  I don't disagree such documentation is desirable.
> 
> >> Other pervasive cases I'm not sure I mentioned then have to
> >> do with unguarded access to constant locale data, that have only been
> >> annotated when the pointer to the current locale object is accessed more
> >> than once.
> 
> > I can't quite follow that.  Do the accesses constitute data races?  Can
> > you give me some more pointers or background info?
> 
> Look at the implementation of the ctype family of macros/functions, for
> one.  Technically, there are data races in there: the global pointer to
> the current locale object can be modified by other threads and they
> don't take anything like a read lock to ensure they have the current
> pointer and that it's not being modified concurrently.
> 
> However, because of the way locale is implemented, the pointer is
> modified atomically (while holding a write lock, and written as a single
> word to a properly aligned location), so readers in other threads may
> get either the old pointer or the new one.
> 
> The old locale data remains valid forever (till the end of the program),
> so it is safe to keep on using it until some synchronization point or
> whatever else updates the local view of the global pointer.
> 
> Numerous functions access ctype data once; given the above, those are
> never a problem, in spite of the race in the global locale pointer.

I have to look at this in detail, but I'm concerned about two issues
there:
1) Do the loads use an acquire fence (ie, atomic_read_barrier)?  If not,
is any new locale data they can read from initialized before those reads
(e.g., at program startup, so happening before any spawned threads)?
2) If the compiler sees such a load (e.g., because it's a macro, or with
LTO and inlining), and it's not marked as an atomic access, it can be
free to reload it.  Which could lead to partially reading from the old
and new locale.

> Some functions access ctype data multiple times; given the above, and
> because the locale structures are accessed in a way that enable the
> compiler to load the global locale pointer only once, and the compiler
> performs this optimization

So correctness applies in some cases on certain, optional compiler
optimizations to be performed?  That wouldn't be good.

> (or, in some cases, the load is factored out
> explicitly in the source code), these are not a problem, in spite of the
> same race.
> 
> Some functions call multiple functions that each access ctype data, each
> one loading the global locale pointer independently.  These are prone to
> inconsistent behavior, since parts of their execution may use one locale
> while another part may use another locale.  These were consistently
> marked with the “locale” keyword.

Thanks for the additional information.  It's helpful.

> 
> >> That's really all I can think of that you might find
> >> objectionable, but I'm sure there'd be plenty of other cases you'd have
> >> objected to if you had done the review.
> 
> > Or maybe not.  What makes you think that way?
> 
> That you've surprised me again and again with objections to issues I
> hadn't regarded as objectionable.
> 
> > I'm asking to get a feel for what else you might have assumed to be
> > okay that I wouldn't.
> 
> How could I do that without knowing what *your* criteria could be?

You can't a priori.  But we can keep talking, and you can explain more
of your reasoning, and we can further discuss it to see whether there
are differences.  The above is a good start, so I appreciate that.

> >> Now here's something that might make your head spin:
> 
> > Sorry, but why?
> 
> Because we've been talking about strcpy, because my recollection
> incorrectly told me ctermid used the internal, non-preemptible name for
> strcpy, but strcpy happens to be actually irrelevant because it is
> optimized away.
> 
> > So now you depend on the compiler's memcpy implementation.  Great.  We
> > can consider the optimized memcpy Mark mentions.  (strcpy on constant
> > string is strlen (known a priori) + memcpy.)
> 
> Well, no, it's more efficient than that, because it doesn't have to load
> the src: it is a known constant.  It's more like an open-coded array
> initialization.

As I said in my previous email, how to write to the destination matters.

> > Please look again at the case he described.
> 
> Likewise.  No, seriously, I mean, look at the *other* case he described,
> the one that actually matters.  Not the one about invalidating a entire
> cache line, that I've already demonstrated as not applicable to this
> short string, but you get back to with a hypothetical larger string that
> might indeed invalidate the underlying assumptions.

Oh great, 

> I'm sorry to
> disappoint you, but my code review abilities do not cover hypothetical
> and nonsensical changes you might want to suggest just to break
> observations derived from the code on which the documented properties
> were based.
> 
> I was talking about the one that starts out by writing garbage to the
> first byte to trigger the prefetching of that cache line.  The one that
> would break the strcpy contract of old standards, but that is valid
> under current standards' contracts that include the as-if rule

Then please refer me to the concrete old standard (and we should support
it, actually) and the wording in there that *requires* strcpy to issue a
certain set of stores instead of just, for example, requiring that when
the function returns, the string has been copied.

Second, when you've done that, could you please explain to me how a
compiler is supposed to optimize code without something like an as-if
rule?

> and the
> no-data-races mandate.

For there to be data races, the standard must actually acknowledge that
things are supposed to work in a multi-threaded setting.  Could you
refer me to the respective wording too, please?

--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html