2010/6/7 luca ellero <lroluk@xxxxxxxxx>:
> Thanks again for your reply, anyway I'm still confused. See inline
> comments.
>
> Pei Lin wrote:
>>
>> 2010/5/31 luca ellero <lroluk@xxxxxxxxx>:
>>>
>>> Pei Lin wrote:
>>>>
>>>> 2010/5/17 luca ellero <lroluk@xxxxxxxxx>:
>>>>>
>>>>> Hi list,
>>>>> I have some (maybe stupid) questions which I can't answer even after
>>>>> reading lots of documentation.
>>>>> Suppose I have a PCI device which has some I/O registers mapped to
>>>>> memory (here I mean accesses are made through memory, not I/O space).
>>>>> As far as I know, the right way to access them is through functions
>>>>> such as iowrite8 and friends:
>>>>>
>>>>>     spin_lock(Q);
>>>>>     iowrite8(some_address, ADDR);
>>>>>     iowrite8(some_data, DATA);
>>>>>     spin_unlock(Q);
>>>>>
>>>>> My questions are:
>>>>>
>>>>> 1) Do I need a write memory barrier (wmb) between the two iowrite8
>>>>> calls?
>>>>> I think I do, because I've read the implementation of iowrite8 and
>>>>> (in kernel 2.6.30.6) it expands to:
>>>>>
>>>>>     void iowrite8(u8 val, void *addr)
>>>>>     {
>>>>>             do {
>>>>>                     unsigned long port = (unsigned long)addr;
>>>>>                     if (port >= 0x40000UL) {
>>>>>                             writeb(val, addr);
>>>>>                     } else if (port > 0x10000UL) {
>>>>>                             port &= 0x0ffffUL;
>>>>>                             outb(val, port);
>>>>>                     } else
>>>>>                             bad_io_access(port, "outb(val,port)");
>>>>>             } while (0);
>>>>>     }
>>>>>
>>>>> where writeb is:
>>>>>
>>>>>     static inline void writeb(unsigned char val, volatile void *addr)
>>>>>     {
>>>>>             asm volatile("movb %0,%1" :
>>>>>                          : "q" (val), "m" (*(volatile unsigned char *)addr)
>>>>>                          : "memory");
>>>>>     }
>>>>>
>>>>> which contains only a compiler barrier (the :"memory" in the asm
>>>>> statement) but no CPU barrier. So, without wmb(), the CPU can
>>>>> reorder the iowrite8 calls with disastrous effects. Am I right?
>>>>>
>>>>> 2) Do I need mmiowb() before spin_unlock()?
>>>>> Documentation about mmiowb() is really confusing me, so any
>>>>> explanation about its use is really welcome.
>>>>
>>>> See the documentation, which explains it clearly:
>>>> http://lxr.linux.no/linux+v2.6.27.46/Documentation/memory-barriers.txt
>>>>
>>>> LOCKS VS I/O ACCESSES
>>>> ---------------------
>>>>
>>>> Under certain circumstances (especially involving NUMA), I/O accesses
>>>> within two spinlocked sections on two different CPUs may be seen as
>>>> interleaved by the PCI bridge, because the PCI bridge does not
>>>> necessarily participate in the cache-coherence protocol, and is
>>>> therefore incapable of issuing the required read memory barriers.
>>>>
>>>> For example:
>>>>
>>>>         CPU 1                           CPU 2
>>>>         =============================== ===============================
>>>>         spin_lock(Q)
>>>>         writel(0, ADDR)
>>>>         writel(1, DATA);
>>>>         spin_unlock(Q);
>>>>                                         spin_lock(Q);
>>>>                                         writel(4, ADDR);
>>>>                                         writel(5, DATA);
>>>>                                         spin_unlock(Q);
>>>>
>>>> may be seen by the PCI bridge as follows:
>>>>
>>>>         STORE *ADDR = 0, STORE *ADDR = 4, STORE *DATA = 1, STORE *DATA = 5
>>>>
>>>> which would probably cause the hardware to malfunction.
>>>>
>>>> What is necessary here is to intervene with an mmiowb() before
>>>> dropping the spinlock, for example:
>>>>
>>>>         CPU 1                           CPU 2
>>>>         =============================== ===============================
>>>>         spin_lock(Q)
>>>>         writel(0, ADDR)
>>>>         writel(1, DATA);
>>>>         mmiowb();
>>>>         spin_unlock(Q);
>>>>                                         spin_lock(Q);
>>>>                                         writel(4, ADDR);
>>>>                                         writel(5, DATA);
>>>>                                         mmiowb();
>>>>                                         spin_unlock(Q);
>>>>
>>>> this will ensure that the two stores issued on CPU 1 appear at the
>>>> PCI bridge before either of the stores issued on CPU 2.
>>>>
>>>> Furthermore, following a store by a load from the same device
>>>> obviates the need for the mmiowb(), because the load forces the
>>>> store to complete before the load is performed:
>>>>
>>>>         CPU 1                           CPU 2
>>>>         =============================== ===============================
>>>>         spin_lock(Q)
>>>>         writel(0, ADDR)
>>>>         a = readl(DATA);
>>>>         spin_unlock(Q);
>>>>                                         spin_lock(Q);
>>>>                                         writel(4, ADDR);
>>>>                                         b = readl(DATA);
>>>>                                         spin_unlock(Q);
>>>>
>>>> See Documentation/DocBook/deviceiobook.tmpl for more information.
>>>
>>> Thanks for your reply.
>>> I've already read the documentation; anyway, what surprises me is the
>>> fact that mmiowb() (at least on x86) is defined as a compiler barrier
>>> (barrier()) and nothing else. I would expect it to do something more
>>> than that: some specific PCI command, or at least a dummy read from
>>> some PCI register (since a read forces the store to complete).
>>
>> As for MIPS, it is defined as:
>>
>>     /* Depends on MIPS II instruction set */
>>     #define mmiowb() asm volatile ("sync" ::: "memory")
>>
>> For x86:
>>
>>     #define mb()  asm volatile("mfence" ::: "memory")
>>     #define rmb() asm volatile("lfence" ::: "memory")
>>     #define wmb() asm volatile("sfence" ::: "memory")
>>
>> For x86, use mfence/lfence/sfence to guarantee it.
>
> That's not true. I confirm my previous assertion: on x86, mmiowb()
> doesn't use any mfence/lfence/sfence, it's only a compiler barrier.

Look at the e-mail I provided:
"Now, on x86, the CPU actually tends to order IO writes *more* than it
orders any other writes (they are mostly entirely synchronous, unless
the area has been marked as write merging), but at least on PPC, it's
the other way around: without the cache as a serialization entry, you
end up having a totally separate queue to serialize, and a
regular-memory write barrier does nothing at all to the IO queue."

So on x86, mmiowb() is only defined as asm volatile (" " ::: "memory").
In other words, x86 itself can guarantee the ordering of I/O writes, I
think.

> See arch/x86/include/asm/io.h:
>
>     #define mmiowb() barrier()
>
>> I found an old mail discussion of mmiowb() usage:
>> http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21056.html
>> http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21096.html
>>
>> From: Nick Piggin <npiggin_at_suse.de>
>> Date: 2007-08-24 12:59:16
>> On Thu, Aug 23, 2007 at 09:16:42AM -0700, Linus Torvalds wrote:
>>> On Thu, 23 Aug 2007, Nick Piggin wrote:
>>>> Also, FWIW, there are some advantages of deferring the mmiowb thingy
>>>> until the point of unlock.
>>>
>>> And that is exactly what ppc64 does.
>>>
>>> But you're missing a big point: for 99.9% of all hardware, mmiowb()
>>> is a total no-op. So when you talk about "advantages", you're not
>>> talking about any *real* advantage, are you?
>>
>>> Furthermore, a lot of PCI drivers seem to ignore its use.
>>> Can you explain that to me?
>>
>> I only found one link which may explain why many drivers removed the
>> mmiowb():
>> http://lwn.net/Articles/283776/
>
> As far as I can see in the 2.6.33 code, this patch was not applied to
> the vanilla kernel source. So that's not the point.
> Regards
> Luca

--
Best Regards
Lin