[Fwd: Re: I/O and memory barriers]

Hi list,

I got no response here, so I did some research.
In case someone is interested, here I reply to myself ;-)
Reading the Intel manual was especially useful:
"The Intel Architecture Software Developer's Manual (Vol. 3) - System Programming Guide"

See inline comments.

luca ellero wrote:
Hi list,
I have some (maybe stupid) questions which I can't answer even after reading lots of documentation. Suppose I have a PCI device which has some I/O registers mapped to memory (by this I mean that accesses are made through memory, not I/O space). As far as I know, the right way to access them is through functions such as iowrite8 and friends:

spin_lock(Q);
iowrite8(some_address, ADDR);
iowrite8(some_data, DATA);
spin_unlock(Q);

My questions are:

1) Do I need a write memory barrier (wmb) between the two iowrite8 calls?
I think I need it because I've read the implementation of iowrite8, and (in kernel 2.6.30.6) it expands to:

void iowrite8(u8 val, void *addr)
{
   do {
       unsigned long port = (unsigned long )addr;
       if (port >= 0x40000UL) {
           writeb(val, addr);
       } else if (port > 0x10000UL) {
           port &= 0x0ffffUL;
           outb(val,port);
       } else bad_io_access(port, "outb(val,port)" );
   } while (0);
}

where writeb is:

static inline void writeb(unsigned char val, volatile void *addr) {
   asm volatile("movb %0,%1":
       :"q" (val), "m" (*(volatile unsigned char *)addr)
       :"memory");
}

which contains only a compiler barrier (the :"memory" clobber in the asm statement) but no CPU barrier. So, without wmb(), the CPU can reorder the two iowrite8 calls, with disastrous effects. Am I right?

It depends on how you mapped the I/O region. If you mapped it uncached, with ioremap_nocache (and that is the right way to map MMIO registers), you don't need a CPU barrier. Note that on x86 plain ioremap is implemented as a call to ioremap_nocache, so it gives the same guarantee; you need the barrier only if the mapping uses a weaker memory type, such as write-combining (ioremap_wc). That's because ioremap_nocache sets the memory type of the mapped pages to "uncached" through the page attributes (PAT, Page Attribute Table), and accesses to memory of type "uncached" are guaranteed NOT to be reordered by the CPU.
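
The rule above can be modeled in a few lines. This is only a userspace sketch, not kernel code: mock_iowrite8, mock_wmb, the map_type enum and the register offsets are hypothetical stand-ins that merely record the order of operations in a log, so the placement of the barrier can be seen and checked. It assumes x86-style semantics, where an uncached mapping keeps writes in program order and any other mapping does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory types a register window might be mapped with. */
enum map_type { MAP_UNCACHED, MAP_WRITE_COMBINING };

/* Hypothetical offsets of the index (ADDR) and data (DATA) registers. */
enum { REG_ADDR = 0, REG_DATA = 1 };

/* One logged operation: a register write or a write barrier. */
enum op_kind { OP_WRITE, OP_WMB };
struct op { enum op_kind kind; int reg; uint8_t val; };

#define LOG_MAX 16
static struct op op_log[LOG_MAX];
static size_t op_count;

static void log_op(enum op_kind kind, int reg, uint8_t val)
{
    assert(op_count < LOG_MAX);
    op_log[op_count++] = (struct op){ kind, reg, val };
}

/* Stand-ins for iowrite8() and wmb(): they only record that they ran. */
static void mock_iowrite8(uint8_t val, int reg) { log_op(OP_WRITE, reg, val); }
static void mock_wmb(void) { log_op(OP_WMB, -1, 0); }

/* Uncached accesses are not reordered by the CPU, so only weaker
 * memory types need an explicit barrier between the two writes. */
static int needs_wmb(enum map_type type)
{
    return type != MAP_UNCACHED;
}

/* Select a register through ADDR, then write its value through DATA. */
static void write_indexed(enum map_type type, uint8_t index, uint8_t value)
{
    mock_iowrite8(index, REG_ADDR);
    if (needs_wmb(type))
        mock_wmb();
    mock_iowrite8(value, REG_DATA);
}
```

In a real driver the two writes would be iowrite8() calls on the mapped addresses, and the barrier would be wmb(); the mock only shows where the barrier sits relative to them.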


2) Do I need mmiowb() before spin_unlock()?
The documentation about mmiowb() is really confusing me, so any explanation of its use is really welcome.

Yes, you need it. It's well explained in memory-barriers.txt in the kernel Documentation directory (section "LOCKS VS I/O ACCESSES").
Anyway, a lot of PCI drivers seem to ignore it. Does that lead to bugs?
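
For reference, the ordering that memory-barriers.txt asks for inside the critical section can be sketched as below. This is a userspace mock, not driver code: every mock_ function is a hypothetical stand-in for the kernel primitive of the same name and does nothing except append one character to a trace, so the sequence can be inspected. The point it illustrates is only the placement: mmiowb() comes after the last MMIO write and before the unlock, so that on platforms where it matters the writes reach the device before another CPU can take the lock and issue its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Trace of operations, one character each:
 * 'L' lock, 'A' write to ADDR, 'D' write to DATA, 'M' mmiowb, 'U' unlock. */
static char io_trace[32];
static size_t io_trace_len;

static void trace(char c)
{
    assert(io_trace_len + 1 < sizeof(io_trace));
    io_trace[io_trace_len++] = c;
    io_trace[io_trace_len] = '\0';
}

/* Hypothetical stand-ins for the kernel primitives; each only logs itself. */
static void mock_spin_lock(void)   { trace('L'); }
static void mock_spin_unlock(void) { trace('U'); }
static void mock_write_addr(unsigned char index) { (void)index; trace('A'); }
static void mock_write_data(unsigned char value) { (void)value; trace('D'); }
static void mock_mmiowb(void)      { trace('M'); }

/* The critical section: mmiowb() is the last step before the unlock. */
static void locked_register_write(unsigned char index, unsigned char value)
{
    mock_spin_lock();
    mock_write_addr(index);
    mock_write_data(value);
    mock_mmiowb();        /* order the MMIO writes before the lock release */
    mock_spin_unlock();
}
```

In the original snippet this corresponds to adding a single mmiowb() call between the second iowrite8() and spin_unlock().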

Comments or corrections are really welcome.

Best regards
Luca

