Re: [WIP 0/3] Memory model and atomic API in Rust

Boqun Feng <boqun.feng@xxxxxxxxx> · Mon, 25 Mar 2024 15:38:32 -0700

On Mon, Mar 25, 2024 at 06:09:19PM -0400, Kent Overstreet wrote:
> On Mon, Mar 25, 2024 at 02:37:14PM -0700, Boqun Feng wrote:
> > On Mon, Mar 25, 2024 at 05:14:41PM -0400, Kent Overstreet wrote:
> > > On Mon, Mar 25, 2024 at 12:44:34PM -0700, Linus Torvalds wrote:
> > > > On Mon, 25 Mar 2024 at 11:59, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
> > > > >
> > > > > To be fair, "volatile" dates from an era when we didn't have the haziest
> > > > > understanding of what a working memory model for C would look like or
> > > > > why we'd even want one.
> > > > 
> > > > I don't disagree, but I find it very depressing that now that we *do*
> > > > know about memory models etc, the C++ memory model basically doubled
> > > > down on the same "object" model.
> > > > 
> > > > > The way the kernel uses volatile in e.g. READ_ONCE() is fully in line
> > > > > with modern thinking, just done with the tools available at the time. A
> > > > > more modern version would be just
> > > > >
> > > > > __atomic_load_n(ptr, __ATOMIC_RELAXED)
> > 
> > Note that Rust does have something similiar:
> > 
> > 	https://doc.rust-lang.org/std/ptr/fn.read_volatile.html
> > 
> > 	pub unsafe fn read_volatile<T>(src: *const T) -> T
> > 
> > (and also write_volatile()). So they made a good design putting the
> > volatile on the accesses rather than the type. However, per the current
> > Rust memory model these two primitives will be UB when data races happen
> > :-(
> > 
> > I mean, sure, if I use read_volatile() on an enum (whose valid values
> > are only 0, 1, 2), and I get a value 3, and the compiler says "you have
> > a logic bug and I refuse to compile the program correctly", I'm OK. But
> > if I use read_volatile() to read something like a u32, and I know it's
> > racy so my program actually handle that, I don't know any sane compiler
> > would miss-compile, so I don't know why that has to be a UB.
> 
> Well, if T is too big to read/write atomically then you'll get torn
> reads, including potentially a bit representation that is not a valid T.
> 
> Which is why the normal read_volatile<> or Volatile<> should disallow
> that.
> 

Well, why a racy read_volatile<> is UB on a T who is valid for all bit
representations is what I was complaining about ;-)

> > > where T is any type that fits in a machine word, and the only operations
> > > it supports are get(), set(), xchg() and cmpxchG().
> > > 
> > > You DO NOT want it to be possible to transparantly use Volatile<T> in
> > > place of a regular T - in exactly the same way as an atomic_t can't be
> > > used in place of a regular integer.
> > 
> > Yes, this is useful. But no it's not that useful, how could you use that
> > to read another CPU's stack during some debug functions in a way you
> > know it's racy?
> 
> That's a pretty difficult thing to do, because you don't know the
> _layout_ of the other CPU's stack, and even if you do it's going to be
> changing underneath you without locking.
> 

It's a debug function, I don't care whether the data is accurate, I just
want to get much information as possible. This kinda of usage, along
with cases where the alorigthms are racy themselves are the primary
reasons of volatile _accesses_ instead of volatile _types_. For example,
you want to read ahead of a counter protected by a lock:

	if (unlikely(READ_ONCE(cnt))) {
		spin_lock(lock);
		int c = cnt; // update of the cnt is protected by a lock.
		...
	}

because you want to skip the case where cnt == 0 in a hotpath, and you
know someone is going to check this again in some slowpath, so
inaccurate data doesn't matter.

> So the races thare are equivalent to a bad mem::transmute(), and that is
> very much UB.
> 
> For a more typical usage of volatile, consider a ringbuffer with one
> thread producing and another thread consuming. Then you've got head and
> tail pointers, each written by one thread and read by another.
> 
> You don't need any locking, just memory barriers and
> READ_ONCE()/WRITE_ONCE() to update the head and tail pointers. If you
> were writing this in Rust today the easy way would be an atomic integer,
> but that's not really correct - you're not doing atomic operations
> (locked arithmetic), just volatile reads and writes.
> 

Confused, I don't see how Volatile<T> is better than just atomic in this
case, since atomc_load() and atomic_store() are also not locked in any
memory model if lockless implementation is available.

> Volatile<T> would be Send and Sync, just like atomic integers. You don't
> need locking if you're just working with single values that are small
> enough for the machine to read/write atomically.

So to me Volatile<T> can help in the cases where we know some memory is
"external", for example a MMIO address, or ringbuffer between guests and
hypervisor. But it doesn't really fix the missing functionality here:
allow generating a plain "mov" instruction on x86 for example on _any
valid memory_, and programmers can take care of the result.

Regards,
Boqun