Thanks again for your reply; anyway, I'm still confused. See my inline
comments.
Pei Lin wrote:
2010/5/31 luca ellero <lroluk@xxxxxxxxx>:
Pei Lin wrote:
2010/5/17 luca ellero <lroluk@xxxxxxxxx>:
Hi list,
I have some (maybe stupid) questions which I can't answer even after
reading lots of documentation.
Suppose I have a PCI device which has some I/O registers mapped to memory
(here I mean accesses are made through memory, not I/O space).
As far as I know, the right way to access them is through functions such as
iowrite8 and friends:

    spin_lock(Q);
    iowrite8(some_address, ADDR);
    iowrite8(some_data, DATA);
    spin_unlock(Q);
My questions are:
1) Do I need a write memory barrier (wmb) between the two iowrite8 calls?
I think I need it because I've read the implementation of iowrite8, and
(in kernel 2.6.30.6) it expands to:

    void iowrite8(u8 val, void *addr)
    {
        do {
            unsigned long port = (unsigned long)addr;
            if (port >= 0x40000UL) {
                writeb(val, addr);
            } else if (port > 0x10000UL) {
                port &= 0x0ffffUL;
                outb(val, port);
            } else
                bad_io_access(port, "outb(val,port)");
        } while (0);
    }
where writeb is:

    static inline void writeb(unsigned char val, volatile void *addr)
    {
        asm volatile("movb %0,%1"
                     : : "q" (val), "m" (*(volatile unsigned char *)addr)
                     : "memory");
    }

which contains only a compiler barrier (the "memory" clobber in the asm
statement) but no CPU barrier. So, without wmb(), the CPU could reorder the
two iowrite8 calls, with disastrous effects. Am I right?
2) Do I need mmiowb() before spin_unlock()?
The documentation about mmiowb() really confuses me, so any explanation of
its use is very welcome.
See the documentation, which explains it clearly:
http://lxr.linux.no/linux+v2.6.27.46/Documentation/memory-barriers.txt
LOCKS VS I/O ACCESSES
---------------------

Under certain circumstances (especially involving NUMA), I/O accesses within
two spinlocked sections on two different CPUs may be seen as interleaved by the
PCI bridge, because the PCI bridge does not necessarily participate in the
cache-coherence protocol, and is therefore incapable of issuing the required
read memory barriers.

For example:

	CPU 1				CPU 2
	===============================	===============================
	spin_lock(Q)
	writel(0, ADDR)
	writel(1, DATA);
	spin_unlock(Q);
					spin_lock(Q);
					writel(4, ADDR);
					writel(5, DATA);
					spin_unlock(Q);

may be seen by the PCI bridge as follows:

	STORE *ADDR = 0, STORE *ADDR = 4, STORE *DATA = 1, STORE *DATA = 5

which would probably cause the hardware to malfunction.


What is necessary here is to intervene with an mmiowb() before dropping the
spinlock, for example:

	CPU 1				CPU 2
	===============================	===============================
	spin_lock(Q)
	writel(0, ADDR)
	writel(1, DATA);
	mmiowb();
	spin_unlock(Q);
					spin_lock(Q);
					writel(4, ADDR);
					writel(5, DATA);
					mmiowb();
					spin_unlock(Q);

this will ensure that the two stores issued on CPU 1 appear at the PCI bridge
before either of the stores issued on CPU 2.


Furthermore, following a store by a load from the same device obviates the need
for the mmiowb(), because the load forces the store to complete before the load
is performed:

	CPU 1				CPU 2
	===============================	===============================
	spin_lock(Q)
	writel(0, ADDR)
	a = readl(DATA);
	spin_unlock(Q);
					spin_lock(Q);
					writel(4, ADDR);
					b = readl(DATA);
					spin_unlock(Q);


See Documentation/DocBook/deviceiobook.tmpl for more information.
Thanks for your reply.
I've already read that documentation; what surprises me is the fact that
mmiowb() (at least on x86) is defined as a compiler barrier (barrier()) and
nothing else. I would expect it to do something more than that: some specific
PCI command, or at least a dummy read from some PCI register
(since a read forces the store to complete).
As for MIPS, it is defined as:

    /* Depends on MIPS II instruction set */
    #define mmiowb() asm volatile ("sync" ::: "memory")

For x86:

    #define mb()  asm volatile("mfence":::"memory")
    #define rmb() asm volatile("lfence":::"memory")
    #define wmb() asm volatile("sfence" ::: "memory")

For x86, use mfence/lfence/sfence to guarantee it.
That's not true. I confirm my previous assertion: on x86, mmiowb()
doesn't use any mfence/lfence/sfence; it's only a compiler barrier.
See arch/x86/include/asm/io.h:

    #define mmiowb() barrier()
I found an old mail discussion about mmiowb() usage:
http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21056.html
http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21096.html
From: Nick Piggin <npiggin_at_suse.de>
Date: 2007-08-24 12:59:16
On Thu, Aug 23, 2007 at 09:16:42AM -0700, Linus Torvalds wrote:
On Thu, 23 Aug 2007, Nick Piggin wrote:
Also, FWIW, there are some advantages of deferring the mmiowb thingy
until the point of unlock.
And that is exactly what ppc64 does.
But you're missing a big point: for 99.9% of all hardware, mmiowb() is a
total no-op. So when you talk about "advantages", you're not talking about
any *real* advantage, are you?
Furthermore, a lot of PCI drivers seem to ignore it entirely.
Can you explain that to me?
I only found one link which may explain why many drivers removed mmiowb():
http://lwn.net/Articles/283776/
As far as I can see in the 2.6.33 code, that patch was not applied to the
vanilla kernel source, so that's not the point.
Regards
Luca
--
To unsubscribe from this list: send an email with
"unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
Please read the FAQ at http://kernelnewbies.org/FAQ