Re: dcache locking question

On Mon, Mar 18, 2019 at 10:11:06AM -0700, Paul E. McKenney wrote:
> On Mon, Mar 18, 2019 at 09:26:18AM -0700, James Bottomley wrote:
> > On Sun, 2019-03-17 at 17:35 -0700, Paul E. McKenney wrote:
> > > On Sat, Mar 16, 2019 at 09:23:16PM -0700, James Bottomley wrote:
> > > > On Sun, 2019-03-17 at 03:06 +0000, Al Viro wrote:
> > > > > On Sat, Mar 16, 2019 at 07:20:20PM -0700, James Bottomley wrote:
> > > > > > On Sat, 2019-03-16 at 17:50 -0700, Paul E. McKenney wrote:
> > > > > > [...]
> > > > > > >  I -have- seen stores of constant values be torn, but not
> > > > > > > stores of runtime-variable values and not loads.  Still, such
> > > > > > > tearing is permitted, and including the READ_ONCE() is making
> > > > > > > it easier for things like thread sanitizers.  In addition,
> > > > > > > the READ_ONCE() makes it clear that the value being loaded is
> > > > > > > unstable, which can be useful documentation.
> > > > > > 
> > > > > > Um, just so I'm clear, because this assumption permeates all
> > > > > > our code: load or store tearing can never occur if we're doing
> > > > > > load or store of a 32 bit value which is naturally
> > > > > > aligned.  Where naturally aligned is within the gift of the CPU
> > > > > > to determine but which the compiler or kernel will always
> > > > > > ensure for us unless we pack the structure or deliberately
> > > > > > misalign the allocation.
> > > 
> > > A non-volatile store of certain 32-bit constants can and does tear
> > > on some architectures.  These architectures would be the ones with a
> > > store-immediate instruction with a small immediate field, and where
> > > the 32-bit constant is such that a pair of 16-bit immediate store
> > > instructions can store that value.
> > 
> > Understood: PA-RISC is one such architecture: our ldil (load immediate
> > long) can only take 21 bits of immediate data and you have to use a
> > second instruction (ldo) to get the remaining 11 bits. However, the
> > compiler guarantees no tearing in memory visibility for PA by doing the
> > ldil/ldo sequence on a register and then writing the register to memory
> > which I believe is an architectural guarantee.
> 
> Good to know, thank you!
> 
> > > There was a bug in an old version of GCC where even volatile 32-bit
> > > stores of these constants would tear.  They did fix the bug, but it
> > > took some time to find a GCC person who understood that this was in
> > > fact a bug.
> > > 
> > > Hence my preference for READ_ONCE() and WRITE_ONCE() for data-racing
> > > loads and stores.
> > 
> > OK, but didn't everyone eventually agree this was a compiler bug?
> 
> They did agree, but only in the case where the store was volatile,
> as in WRITE_ONCE(), and -not- in the case of a plain store.
> 
> At least the kernel doesn't make general use of vector instructions.
> If it did, I would not be surprised to see compilers use three 32-bit
> vector stores to store to a 32-bit int adjacent to a 64-bit pointer.  :-/

And it turns out that the CPU architecture in question was x86-64, for
whatever that is worth.  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55981

(There is also a later bug report dealing strictly with volatile, but
my search-engine skills are failing me this morning.)
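
For readers following along, the difference between a plain store and a
marked store can be sketched in user space roughly as below.  This is a
simplified illustration, not the kernel's actual definitions: the macros
are stand-ins that assume GCC or Clang (for __typeof__), and shared_flag
and the three helper functions are hypothetical names made up for the
example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-ins for the kernel macros (assumption: GCC or Clang,
 * which provide __typeof__).  The volatile-qualified access is intended
 * to make the compiler emit a single access of the object's full width.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical word that another thread might read concurrently. */
uint32_t shared_flag;

/*
 * Plain store of a constant.  The compiler is permitted to implement
 * this as more than one narrower store, which is the tearing case
 * discussed in this thread.
 */
void set_flag_plain(void)
{
	shared_flag = 0x00010001u;
}

/* Marked store: documents the data race and forbids splitting. */
void set_flag_once(uint32_t v)
{
	WRITE_ONCE(shared_flag, v);
}

/* Marked load: also tells the reader that the value is unstable. */
uint32_t get_flag_once(void)
{
	return READ_ONCE(shared_flag);
}

/* Single-threaded sanity check; it cannot demonstrate tearing itself. */
void self_check(void)
{
	set_flag_once(7u);
	assert(get_flag_once() == 7u);
}
```

Whether a given compiler actually splits the plain store depends on the
target and the optimization level, so comparing the generated assembly
for set_flag_plain() and set_flag_once() is the only way to see a
difference, if there is one.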

							Thanx, Paul

> > > > > Wait a sec; are there any 64bit architectures where the same is
> > > > > not guaranteed for dereferencing properly aligned void **?
> > > > 
> > > > Yes, naturally aligned void * dereference shouldn't tear
> > > > either.  I was just using 32 bit as my example because 64 bit
> > > > accesses will tear on 32 bit architectures but 64 bit naturally
> > > > aligned accesses shouldn't tear on 64 bit architectures.  However,
> > > > since we can't guarantee the 64 bitness of the architecture 32 bit
> > > > or void * is our gold standard for not tearing.
> > > 
> > > For stores of quantities not known at compile time, agreed.  But
> > > that same store-immediate situation could happen on 64-bit systems.
> > > 
> > > > James
> > > > 
> > > > 
> > > > > If that's the case, I can think of quite a few places that are
> > > > > rather dubious, and I don't see how READ_ONCE() could help in
> > > > > those - e.g. if an architecture only has 32bit loads, rcu list
> > > > > traversals are not going to be doable without one hell of an
> > > > > extra headache.
> > > 
> > > All the 64-bit systems that run the Linux kernel do have 64-bit load
> > > instructions and rcu_dereference() uses READ_ONCE() internally, so we
> > > should be fine with RCU list traversals.
> > 
> > I really don't think it's possible to get the same immediate constant
> > tearing bug on 64 bit.  If you look at PA, we have no 64 bit
> > equivalent of the ldil/ldo pair so all 64 bit immediate stores come
> > straight from the global data table via a register, so no tearing.  I
> > bet every 64 bit architecture has a similar approach because 64 bit
> > immediate data just requires too many bits to stuff into an instruction
> > pair.
> > 
> > James
> > 



