On Thu, Jul 2, 2020 at 10:55 AM Will Deacon <will@xxxxxxxxxx> wrote: > On Thu, Jul 02, 2020 at 10:43:55AM -0400, Joel Fernandes wrote: > > On Tue, Jun 30, 2020 at 1:38 PM Will Deacon <will@xxxxxxxxxx> wrote: > > > diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h > > > index 92ec486a4f9e..2ecd068d91d1 100644 > > > --- a/arch/alpha/include/asm/barrier.h > > > +++ b/arch/alpha/include/asm/barrier.h > > > - * For example, the following code would force ordering (the initial > > > - * value of "a" is zero, "b" is one, and "p" is "&a"): > > > - * > > > - * <programlisting> > > > - * CPU 0 CPU 1 > > > - * > > > - * b = 2; > > > - * memory_barrier(); > > > - * p = &b; q = p; > > > - * read_barrier_depends(); > > > - * d = *q; > > > - * </programlisting> > > > - * > > > - * because the read of "*q" depends on the read of "p" and these > > > - * two reads are separated by a read_barrier_depends(). However, > > > - * the following code, with the same initial values for "a" and "b": > > > - * > > > > Would it be Ok to keep this example in the kernel sources? I think it > > serves as good documentation and highlights the issue in the Alpha > > architecture well. > > I'd _really_ like to remove it, as I think it only serves to confuse people > on a topic that is confusing enough already. Paul's perfbook [1] already has > plenty of information about this, so I don't think we need to repeat that > here. I could add a citation, perhaps? True, and also found that LKMM docs and the memory-barriers.txt talks about it, so removing it here sounds good to me. Maybe a reference here to either documentation should be Ok. > > BTW, do you know any architecture where speculative execution of > > address-dependent loads can cause similar misorderings? That would be > > pretty insane though. In Alpha's case it is not speculation but rather > > the split local cache design as the docs mention. The reason I ask > > is it is pretty amusing that control-dependent loads do have such > > misordering issues due to speculative branch execution and I wondered > > what other games the CPUs are playing. FWIW I ran into [1] which talks > > about analogy between memory dependence and control dependence. > > I think you're asking about value prediction, and the implications it would > have on address-dependent loads where the address can itself be predicted. Yes. > I'm not aware of an CPUs where that is observable architecturally. I see. > arm64 has some load instructions that do not honour address dependencies, > but I believe that's mainly to enable alternative cache designs for things > like non-temporal and large vector loads. Good to know this, thanks. - Joel