2010/5/31 luca ellero <lroluk@xxxxxxxxx>:
> Pei Lin wrote:
>>
>> 2010/5/17 luca ellero <lroluk@xxxxxxxxx>:
>>
>>> Hi list,
>>> I have some (maybe stupid) questions which I can't answer even after
>>> reading lots of documentation.
>>> Suppose I have a PCI device which has some I/O registers mapped to
>>> memory (here I mean accesses are made through memory, not I/O space).
>>> As far as I know, the right way to access them is through functions
>>> such as iowrite8 and friends:
>>>
>>> spin_lock(Q);
>>> iowrite8(some_address, ADDR);
>>> iowrite8(some_data, DATA);
>>> spin_unlock(Q);
>>>
>>> My questions are:
>>>
>>> 1) Do I need a write memory barrier (wmb) between the two iowrite8
>>> calls? I think I do, because I've read the implementation of iowrite8
>>> and (in kernel 2.6.30.6) it expands to:
>>>
>>> void iowrite8(u8 val, void *addr)
>>> {
>>>         do {
>>>                 unsigned long port = (unsigned long)addr;
>>>                 if (port >= 0x40000UL) {
>>>                         writeb(val, addr);
>>>                 } else if (port > 0x10000UL) {
>>>                         port &= 0x0ffffUL;
>>>                         outb(val, port);
>>>                 } else
>>>                         bad_io_access(port, "outb(val,port)");
>>>         } while (0);
>>> }
>>>
>>> where writeb is:
>>>
>>> static inline void writeb(unsigned char val, volatile void *addr)
>>> {
>>>         asm volatile("movb %0,%1":
>>>                      :"q" (val), "m" (*(volatile unsigned char *)addr)
>>>                      :"memory");
>>> }
>>>
>>> which contains only a compiler barrier (the :"memory" clobber in the
>>> asm statement) but no CPU barrier. So, without wmb(), the CPU can
>>> reorder the two iowrite8 stores, with disastrous effects. Am I right?
>>>
>>> 2) Do I need mmiowb() before spin_unlock()?
>>> The documentation about mmiowb() is really confusing me, so any
>>> explanation about its use is really welcome.
>>
>> See the documentation, which explains it clearly:
>> http://lxr.linux.no/linux+v2.6.27.46/Documentation/memory-barriers.txt
>>
>> LOCKS VS I/O ACCESSES
>> ---------------------
>>
>> Under certain circumstances (especially involving NUMA), I/O accesses
>> within two spinlocked sections on two different CPUs may be seen as
>> interleaved by the PCI bridge, because the PCI bridge does not
>> necessarily participate in the cache-coherence protocol, and is
>> therefore incapable of issuing the required read memory barriers.
>>
>> For example:
>>
>>         CPU 1                           CPU 2
>>         =============================== ===============================
>>         spin_lock(Q)
>>         writel(0, ADDR)
>>         writel(1, DATA);
>>         spin_unlock(Q);
>>                                         spin_lock(Q);
>>                                         writel(4, ADDR);
>>                                         writel(5, DATA);
>>                                         spin_unlock(Q);
>>
>> may be seen by the PCI bridge as follows:
>>
>>         STORE *ADDR = 0, STORE *ADDR = 4, STORE *DATA = 1, STORE *DATA = 5
>>
>> which would probably cause the hardware to malfunction.
>>
>> What is necessary here is to intervene with an mmiowb() before
>> dropping the spinlock, for example:
>>
>>         CPU 1                           CPU 2
>>         =============================== ===============================
>>         spin_lock(Q)
>>         writel(0, ADDR)
>>         writel(1, DATA);
>>         mmiowb();
>>         spin_unlock(Q);
>>                                         spin_lock(Q);
>>                                         writel(4, ADDR);
>>                                         writel(5, DATA);
>>                                         mmiowb();
>>                                         spin_unlock(Q);
>>
>> this will ensure that the two stores issued on CPU 1 appear at the
>> PCI bridge before either of the stores issued on CPU 2.
>>
>> Furthermore, following a store by a load from the same device obviates
>> the need for the mmiowb(), because the load forces the store to
>> complete before the load is performed:
>>
>>         CPU 1                           CPU 2
>>         =============================== ===============================
>>         spin_lock(Q)
>>         writel(0, ADDR)
>>         a = readl(DATA);
>>         spin_unlock(Q);
>>                                         spin_lock(Q);
>>                                         writel(4, ADDR);
>>                                         b = readl(DATA);
>>                                         spin_unlock(Q);
>>
>> See Documentation/DocBook/deviceiobook.tmpl for more information.
>>
>
> Thanks for your reply,
> I've already read the documentation. Anyway, what surprises me is that
> mmiowb() (at least on x86) is defined as a compiler barrier (barrier())
> and nothing else. I would expect it to do something more than that:
> some specific PCI command, or at least a dummy "read" from some PCI
> register (since a read forces the store to complete).

As for MIPS, it is defined as:

/* Depends on MIPS II instruction set */
#define mmiowb() asm volatile ("sync" ::: "memory")

For x86:

#define mb()  asm volatile("mfence":::"memory")
#define rmb() asm volatile("lfence":::"memory")
#define wmb() asm volatile("sfence" ::: "memory")

so on x86 it is mfence/lfence/sfence that provide those ordering
guarantees.

I found an old mail discussion about the use of mmiowb():
http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21056.html
http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21096.html

From: Nick Piggin <npiggin_at_suse.de>
Date: 2007-08-24 12:59:16

On Thu, Aug 23, 2007 at 09:16:42AM -0700, Linus Torvalds wrote:
>
> On Thu, 23 Aug 2007, Nick Piggin wrote:
> >
> > Also, FWIW, there are some advantages of deferring the mmiowb thingy
> > until the point of unlock.
>
> And that is exactly what ppc64 does.
>
> But you're missing a big point: for 99.9% of all hardware, mmiowb() is
> a total no-op. So when you talk about "advantages", you're not talking
> about any *real* advantage, are you?

> Furthermore, a lot of PCI drivers seem to ignore its use.
> Can you explain that to me?

I only found one link which may explain why many drivers removed the
mmiowb():
http://lwn.net/Articles/283776/

> Luca
>

--
Best Regards
Lin
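To make the two options from the quoted memory-barriers.txt text concrete,
here is a minimal sketch of a driver write path for a device like the one
Luca describes. It is not taken from any real driver: the structure, the
register offsets and the names (my_dev, ADDR_REG, DATA_REG, my_dev_write,
my_dev_write_flush) are hypothetical, and whether an extra wmb() between
the two stores is strictly required on a given architecture is exactly the
open question in the thread; the sketch simply follows the pattern the
documentation of that era recommends.

    #include <linux/types.h>
    #include <linux/spinlock.h>
    #include <linux/io.h>

    /* Hypothetical register offsets within an ioremap()ed/pci_iomap()ed BAR. */
    #define ADDR_REG 0x00
    #define DATA_REG 0x04

    struct my_dev {
            void __iomem *regs;     /* e.g. from pci_iomap() */
            spinlock_t lock;
    };

    /*
     * Variant 1: mmiowb() before dropping the lock, as memory-barriers.txt
     * recommends.  On most platforms (e.g. x86) it is a no-op, but on e.g.
     * ia64 or MIPS it expands to a real ordering instruction.
     */
    static void my_dev_write(struct my_dev *dev, u8 reg, u8 val)
    {
            unsigned long flags;

            spin_lock_irqsave(&dev->lock, flags);
            iowrite8(reg, dev->regs + ADDR_REG);
            /* (whether a wmb() is also needed here is question 1 in the thread) */
            iowrite8(val, dev->regs + DATA_REG);
            mmiowb();       /* keep these MMIO stores ahead of the next lock
                             * holder's stores, as seen by the PCI bridge */
            spin_unlock_irqrestore(&dev->lock, flags);
    }

    /*
     * Variant 2: read back from the device instead; the read forces the
     * posted writes to complete, so no mmiowb() is needed before unlocking.
     */
    static u8 my_dev_write_flush(struct my_dev *dev, u8 reg, u8 val)
    {
            unsigned long flags;
            u8 ret;

            spin_lock_irqsave(&dev->lock, flags);
            iowrite8(reg, dev->regs + ADDR_REG);
            iowrite8(val, dev->regs + DATA_REG);
            ret = ioread8(dev->regs + DATA_REG);    /* flushes the posted writes */
            spin_unlock_irqrestore(&dev->lock, flags);

            return ret;
    }

The read-back variant costs a full MMIO read round trip to the device, but,
as the quoted documentation notes, it removes the need for mmiowb() entirely.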