On Mon, Feb 13, 2023 at 8:57 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 13, 2023 at 07:36:42PM -0500, Joel Fernandes wrote:
> > Thanks, I agree with most of your last email, just replying to one thing:
> >
> > > > ->rf does because of data flow causality, ->ppo does because of
> > > > program structure, so that makes sense to be ->hb.
> > > >
> > > > IMHO, ->rfi should as well, because it is embodying a flow of data, so
> > > > that is a bit confusing. It would be great to clarify more perhaps
> > > > with an example about why ->rfi cannot be ->hb, in the
> > > > "happens-before" section.
> > >
> > > Maybe. We do talk about store forwarding, and in fact the ppo section
> > > already says:
> > >
> > > ------------------------------------------------------------------------
> > >         R ->dep W ->rfi R',
> > >
> > > where the dep link can be either an address or a data dependency. In
> > > this situation we know it is possible for the CPU to execute R' before
> > > W, because it can forward the value that W will store to R'.
> > > ------------------------------------------------------------------------
> >
> > Thank you for pointing this out! In the text that follows this, in
> > this paragraph:
> >
> > <quote>
> > where the dep link can be either an address or a data dependency. In
> > this situation we know it is possible for the CPU to execute R' before
> > W, because it can forward the value that W will store to R'. But it
> > cannot execute R' before R, because it cannot forward the value before
> > it knows what that value is, or that W and R' do access the same
> > location.
> > </quote>
> >
> > The "in this situation" should be clarified that the "situation" is a
> > data-dependency. Only in the case of data-dependency, the ->rfi
> > cannot cause misordering if I understand it correctly. However, that
> > sentence does not mention data-dependency explicitly. Or let me know
> > if I missed something?
>
> The text explicitly says that the dep link can be either an address or a
> data dependency. In either case, R' cannot be reordered before R.
>
> In theory this doesn't have to be true for address dependencies, because
> the CPU might realize that W and R' access the same address without
> knowing what that address is. However, I've been reliably informed that
> no existing architectures do this sort of optimization.
>
> The case of a control dependency is different, because the CPU can
> speculate that W will be executed and can speculatively forward the
> value from W to R' before it knows what value R will read.
>

Sorry, I misread it. You are right. Got it now, Thanks.

 - Joel
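
For readers following the thread, the three cases above (data, address and
control dependency in front of the ->rfi link) could be sketched in C roughly
as follows. This is only an illustration and is not taken from the LKMM
documentation: READ_ONCE()/WRITE_ONCE() are userspace stand-ins built on
volatile objects, and every function and variable name is hypothetical. Run
on a single thread it shows the shape of each dependency, not any reordering.

```c
/* Userspace stand-ins (assumption): the kernel's READ_ONCE()/WRITE_ONCE()
 * are approximated here by plain accesses to volatile objects. */
#define READ_ONCE(v)        (v)
#define WRITE_ONCE(v, val)  ((v) = (val))

static volatile int x, y;
static volatile int arr[2];

/* Hypothetical helper: put the shared variables in a known state. */
void reset(int xv)
{
	x = xv;
	y = 0;
	arr[0] = 0;
	arr[1] = 0;
}

/* R ->data W ->rfi R': the value W stores is computed from R. */
int data_dep(void)
{
	int r1 = READ_ONCE(x);      /* R  */
	WRITE_ONCE(y, r1 + 1);      /* W: stored value depends on R */
	int r2 = READ_ONCE(y);      /* R': W's value can be forwarded to it,
	                             * but only once R has supplied r1 */
	return r2;
}

/* R ->addr W ->rfi R': the location W stores to is computed from R. */
int addr_dep(void)
{
	int r1 = READ_ONCE(x);      /* R  */
	WRITE_ONCE(arr[r1 & 1], 7); /* W: target address depends on R */
	int r2 = READ_ONCE(arr[0]); /* R': reads from W only if (r1 & 1) == 0,
	                             * which is not known until R completes */
	return r2;
}

/* R ->ctrl W ->rfi R': only whether W executes depends on R. */
int ctrl_dep(void)
{
	int r1 = READ_ONCE(x);      /* R  */
	if (r1)
		WRITE_ONCE(y, 5);   /* W: constant value, fixed address */
	int r2 = READ_ONCE(y);      /* R': a CPU may speculate that W runs and
	                             * forward 5 before R's value is known */
	return r2;
}
```

In the first two functions the CPU needs the result of R before it has
anything to forward (the value in one case, the matching address in the
other), which is the reason given above for R' staying ordered after R. In
ctrl_dep() both the value and the address of W are known up front, so that
argument does not apply.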