On Wednesday 28 September 2011 18:44:25 Jeremy Fitzhardinge wrote:
> On 09/28/2011 06:58 AM, Stephan Diestelhorst wrote:
> >> I guess it comes down to throwing myself on the efficiency of some kind
> >> of fence instruction. I guess an lfence would be sufficient; is that
> >> any more efficient than a full mfence?
> > An lfence should not be sufficient, since that essentially is a NOP on
> > WB memory. You really want a full fence here, since the store needs to
> > be published before reading the lock with the next load.
>
> The Intel manual reads:
>
> Reads cannot pass earlier LFENCE and MFENCE instructions.
> Writes cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.
> LFENCE instructions cannot pass earlier reads.
>
> Which I interpreted as meaning that an lfence would prevent forwarding.
> But I guess it doesn't say "lfence instructions cannot pass earlier
> writes", which means that the lfence could logically happen before the
> write, thereby allowing forwarding? Or should I be reading this some
> other way?

Indeed. You are reading this the right way.

> >> Could you give me a pointer to AMD's description of the ordering rules?
> > They should be in "AMD64 Architecture Programmer's Manual Volume 2:
> > System Programming", Section 7.2 Multiprocessor Memory Access Ordering.
> >
> > http://developer.amd.com/documentation/guides/pages/default.aspx#manuals
> >
> > Let me know if you have some clarifying suggestions. We are currently
> > revising these documents...
>
> I find the English descriptions of these kinds of things frustrating to
> read because of ambiguities in the precise meaning of words like "pass",
> "ahead", "behind" in these contexts. I find the prose useful to get an
> overview, but when I have a specific question I wonder if something more
> formal would be useful.

It would be, and some have started this effort:

http://www.cl.cam.ac.uk/~pes20/weakmemory/

But I am not sure whether that particular nasty forwarding case is
captured properly in their model. It is on my list of things to check.

> I guess it's implied that anything that is not prohibited by the
> ordering rules is allowed, but it wouldn't hurt to say it explicitly.
> That said, the AMD description seems clearer and more explicit than the
> Intel manual (esp since it specifically discusses the problem here).

Thanks! Glad you like it :)

Stephan
--
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@xxxxxxx, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH
Einsteinring 24
85609 Aschheim
Germany
Geschaeftsfuehrer: Alberto Bozzo;
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632, WEEE-Reg-Nr: DE 12919551
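
For illustration only, the store-then-load pattern discussed in this thread
can be sketched in portable C11 roughly as follows. This is a minimal sketch
and not the kernel code under discussion: the names handshake_t and
publish_then_check are hypothetical, and it assumes that a sequentially
consistent fence is an acceptable stand-in for the full fence (on x86,
compilers typically emit an mfence or a locked instruction for it). The point
is only that the store is published before the following load is performed,
which is the ordering an lfence alone does not provide.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: one flag per side of a two-party handshake. */
typedef struct {
    atomic_int flag[2];
} handshake_t;

/*
 * Publish our own flag, then read the other side's flag.
 * Returns true if the other side had not published its flag yet.
 */
static bool publish_then_check(handshake_t *h, int me)
{
    assert(me == 0 || me == 1);
    int other = 1 - me;

    /* The store that has to become globally visible first. */
    atomic_store_explicit(&h->flag[me], 1, memory_order_relaxed);

    /*
     * Full fence (assumption: stands in for mfence). It orders the
     * store above before the load below, so the load cannot be
     * performed while the store is still waiting to be published.
     */
    atomic_thread_fence(memory_order_seq_cst);

    /* The load that must not be performed ahead of the store. */
    return atomic_load_explicit(&h->flag[other], memory_order_relaxed) == 0;
}
```

If both sides run publish_then_check() concurrently, the fences ensure that
at most one of them can return true; with a weaker fence in its place, both
loads could be performed before either store is visible to the other side.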