On Fri, Jul 30, 2021 at 06:20:22PM +0100, Jade Alglave wrote: > I hope this material can help inform this conversation and I would love > to hear your thoughts. More comments... I find the herd-style diagrams (Figures 2, 3, 5, 7, 9, and so on) almost impossible to decipher. While they might be useful to people running herd, they have several drawbacks for readers of this report: They include multiple instructions, not just the one for which you want to illustrate the internal dependencies. How about getting rid of the extraneous instructions? Each box contains three lines of information, of which only the first is really significant, and it is hard to figure out. How about getting rid of the second and third lines, and replacing things like "e: R0:X1q=x" in the first line with something more along the lines of "RegR X0" or "tmp1 = RegR X0"? The "iico" in the dependency arrows doesn't add anything. Section 1.1 mentions order, data, and control Intrinsic dependencies but doesn't give so much as a hint as to what they are. Instead the reader is forced to invent his own generalizations by reading through several complex special-case examples. There should be a short description of what each Intrinsic dependency represents. For instance, the first sentence in 1.3 would be a great way to explain data dependencies. (And is it not true that control dependencies are mainly needed for situations where an instruction's inputs and outputs may include the same register or memory address, when it is necessary to enforce that the input value is read before the output value is written?) Some of the dependencies listed for CAS are surprising, but there is no explanation. Why is C2 a control dependency rather than a data dependency? After all, the value read from [Xn] is stored in Xs in both cases. In fact, Df1 supersedes C2 in the failure case, doesn't it? And why are C1 and Ds1 a control and data dependency respectively rather than both order dependencies? Section 2.1: Although the Store F is independent of the conditional branch and so might be made visible to other observers early, isn't it true that neither ARMv8 nor any other type of processor will do this? General question: How does this discussion of conditional branches relate overall to the way computed branches are handled? Alan