Hi!

On Mon, Jun 21, 2021 at 04:11:04PM +0200, Christophe Leroy wrote:
> On 19/06/2021 at 17:02, Segher Boessenkool wrote:
> >The point of the twi in the I/O accessors was to make things easier to
> >debug if the accesses fail: for the twi insn to complete the load will
> >have to have completed as well. On a correctly working system you never
> >should need this (until something fails ;-) )
> >
> >Without the twi you might need to enforce ordering in some cases still.
> >The twi is a very heavy hammer, but some of what that gives us is no
> >doubt actually needed.
>
> Well, I've always been quite perplexed about that. According to the
> documentation of the 8xx, if a bus error or something happens on an I/O
> access, the exception will be accounted on the instruction which does the
> access. But based on the following function, I understand that some versions
> of powerpc do generate the trap on the instruction which was being executed
> at the time the I/O access failed, not the instruction that does the access
> itself ?

Trap instructions are never speculated (this may not be architectural,
but it is true on all existing implementations). So the instructions
after the twi;isync will not execute until the twi itself has finished,
and that cannot happen before the preceding load has (because it uses
the loaded register).

Now, some I/O accesses can cause machine checks. Machine checks are
asynchronous and can be hard to correlate to specific load insns, and
worse, you may not even have the address loaded from in architected
registers anymore. Since I/O accesses often take *long* (tens or even
hundreds of cycles is not unusual), this can be a challenge.

To recover from machine checks you typically need special debug hardware
and/or software. For the Apple machines those are not so easy to come
by. This "twi after loads" thing made it pretty easy to figure out
where your code was going wrong.
And it isn't as slow as it may sound: typically you really need to have
the result of the load before you can go on to do useful work anyway,
and loads from I/O are slow non-posted things.

> /*
>  * I/O accesses can cause machine checks on powermacs.
>  * Check if the NIP corresponds to the address of a sync
>  * instruction for which there is an entry in the exception
>  * table.
>  *  -- paulus.
>  */

I suspect this is from before the twi thing was added?

> It is not only the twi which bothers me in the I/O accessors but also the
> sync/isync and stuff.
>
> A write typically is
>
> 	sync
> 	stw
>
> A read is
>
> 	sync
> 	lwz
> 	twi
> 	isync
>
> Taking into account that HW ordering is guaranteed by the fact that __iomem
> is guarded,

Yes. But machine checks are asynchronous :-)

> isn't the 'memory' clobber enough as a barrier ?

A "memory" clobber isn't a barrier of any kind. "Compiler barriers" do
not exist. The only thing such a clobber does is tell the compiler
that this inline asm can access some memory, and we do not say at what
address. So the compiler cannot reorder this asm with other memory
accesses. It has no other effects, no magical effects, and it is not
comparable to actual barrier instructions (which actually tell the
hardware that some certain ordering is required).

"Compiler barrier" is a harmful misnomer: language shapes thoughts, and
using misleading names causes misguided thoughts.

Anyway :-)

The isync is simply to make sure the code after it does not start before
the code before it has completed. About the sync before it, I am not
sure.


Segher
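
For reference, the read and write sequences discussed above can be
sketched as C inline asm. This is only a minimal sketch, not the
kernel's actual accessors: the names sketch_in32 and sketch_out32 are
hypothetical, the address is passed in a base register for simplicity,
and the non-PowerPC fallback is there only so the example builds on
other machines.

```c
#include <stdint.h>

/* Hypothetical name; illustrates the "sync; lwz; twi; isync" read sequence. */
static inline uint32_t sketch_in32(const volatile uint32_t *addr)
{
	uint32_t ret;
#if defined(__powerpc__) || defined(__powerpc64__)
	__asm__ __volatile__(
		"sync\n\t"		/* as in the quoted read sequence */
		"lwz %0,0(%1)\n\t"	/* the actual load */
		"twi 0,%0,0\n\t"	/* TO=0 never traps, but it needs the loaded value */
		"isync"			/* later insns do not start before the twi is done */
		: "=r" (ret)
		: "b" (addr)
		/* "memory" only tells the compiler that this asm may access
		 * memory at an unspecified address; it emits no instruction. */
		: "memory");
#else
	ret = *addr;			/* fallback (assumption): plain volatile load */
#endif
	return ret;
}

/* Hypothetical name; illustrates the "sync; stw" write sequence. */
static inline void sketch_out32(volatile uint32_t *addr, uint32_t val)
{
#if defined(__powerpc__) || defined(__powerpc64__)
	__asm__ __volatile__(
		"sync\n\t"		/* as in the quoted write sequence */
		"stw %1,0(%0)"		/* the actual store */
		:
		: "b" (addr), "r" (val)
		: "memory");
#else
	*addr = val;			/* fallback (assumption): plain volatile store */
#endif
}
```

The twi here uses TO=0, so it can never trap; its only job is to
consume the loaded register, so that it (and, through the isync,
everything after it) cannot complete before the load has.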