On Wed, Feb 5, 2014 at 1:22 PM, Martyn Welch <martyn@xxxxxxxxxxxx> wrote: > > > > On 5 February 2014 17:41, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote: >> >> On Tue, Feb 04, 2014 at 06:34:13PM +0000, Martyn Welch wrote: >> > On 04/02/14 16:34, Michael Kenney wrote: >> > >Hi Martyn, >> > > >> > >On Tue, Feb 4, 2014 at 7:28 AM, Martyn Welch <martyn.welch@xxxxxx >> > ><mailto:martyn.welch@xxxxxx>> wrote: >> > > >> > > On 28/12/13 00:34, Michael Kenney wrote: >> > > >> > > Hi Martyn, >> > > >> > > On Fri, Dec 27, 2013 at 4:23 PM, Martyn Welch >> > > <martyn@xxxxxxxxxxxx <mailto:martyn@xxxxxxxxxxxx>> wrote: >> > > >> > > On 27/12/13 20:15, Michael Kenney wrote: >> > > >> > > >> > > We are using the vme_tsi148 bridge driver along with >> > > the >> > > vme_user >> > > driver to access the VME boards. The A/D board requires >> > > D32 bus cycles >> > > and the VME master window is configured accordingly, >> > > however, when >> > > monitoring the bus cycles with a logic analyzer, we >> > > noticed that the >> > > CPU is transferring one byte at a time (i.e. four D8 >> > > transfers rather >> > > than one D32). >> > > >> > > Is this the expected behavior of the tsi148 driver? >> > > >> > > >> > > Hi Mike, >> > > >> > > This is certainly not the expected behaviour - if the >> > > window >> > > is configured >> > > for D32 then it should do 32 bit transfers where possible. >> > > >> > > I've heard of this happening recently, but haven't yet been >> > > able to >> > > replicate it. Which VME board are you running Linux on and >> > > which flavour of >> > > Linux? >> > > >> > > >> > > I'm running Debian 7.2 with kernel 3.2 on a Fastwel CPC600 >> > > (Pentium M >> > > based CPU board). >> > > >> > > >> > > I haven't forgotten about this, still not sure exactly what is >> > > happening. >> > > >> > > Is your install/kernel 32 or 64 bit? >> > > >> > > Are you doing single 32-bit transfers, or are you seeing this on >> > > longer transfers (i.e. copying a buffer full of data)? >> > > >> > > >> > >Thanks for getting back to me. >> > > >> > >I'm running a 32-bit kernel and I see this behavior on all transfers >> > >regardless of buffer size. >> > > >> > >> > Gah! Thought I could see what may be causing it in 64-bit kernels, but >> > not >> > 32-bit (my x86 asm is not particularly hot). >> > >> > I think we /may/ be hitting issues with how the memcpy function gets >> > implemented on specific architectures. The tsi148 is a PCI/X to VME >> > bridge, >> > if it receives a series of 8-bit reads or writes, it translates these to >> > 8-bit reads or writes on the VME bus, which is not necessarily what we >> > want. >> > >> > According to the data sheet, the card you are using has an Intel 6300ESB >> > ICH, which seems to be PCI/X (so we can rule out PCIe to PCI/X bridges >> > or >> > something like that doing nasty things). >> > >> > I think (if I follow everything correctly) then the memcpy for 32-bit is >> > handled by __memcpy in arch/x86/include/asm/string_32.h: >> > >> > static __always_inline void *__memcpy(void *to, const void *from, size_t >> > n) >> > { >> > int d0, d1, d2; >> > asm volatile("rep ; movsl\n\t" >> > "movl %4,%%ecx\n\t" >> > "andl $3,%%ecx\n\t" >> > "jz 1f\n\t" >> > "rep ; movsb\n\t" >> > "1:" >> > : "=&c" (d0), "=&D" (d1), "=&S" (d2) >> > : "0" (n / 4), "g" (n), "1" ((long)to), "2" >> > ((long)from) >> > : "memory"); >> > return to; >> > } >> > >> > I'd expected this function to use movl (32-bit moves) where possible, >> > but >> > movsb to get to naturally aligned moves (which is something that we deal >> > with in the VME code already to use 16-bit reads where we can). >> > >> > Greg (as co-maintainer of the VME subsystem :-) ), Am I reading this >> > right? >> >> Yes, but why would it matter if we copy by bytes, bits, or dwords to a >> memory-backed bus? By definition, that shouldn't matter. If it does >> matter, then we shouldn't be using memcpy, or relying on the compiler to >> do the copying, but rather we need to use a write() function instead as >> this would be IO memory that does care about this type of thing. >> > > The drivers are using the functions memcpy_toio and memcpy_fromio for the > bulk of the transfer, these seem to equate to memcpy on x86. > > In this instance it does matter - some devices can only accept certain width > accesses and the VME bridge will translate a 8-bit PCI transfer to a 8-bit > transfer on the VME bus. > > So (and I haven't even compile tested this yet), this?: > > diff --git a/drivers/vme/bridges/vme_ca91cx42.c > b/drivers/vme/bridges/vme_ca91cx42.c > index 1425d22c..0d87ebd 100644 > --- a/drivers/vme/bridges/vme_ca91cx42.c > +++ b/drivers/vme/bridges/vme_ca91cx42.c > @@ -894,9 +894,9 @@ static ssize_t ca91cx42_master_read(struct > vme_master_resource *image, > } > > count32 = (count - done) & ~0x3; > - if (count32 > 0) { > - memcpy_fromio(buf + done, addr + done, (unsigned int)count); > - done += count32; > + while (done < count32) { > + *(u32 *)(buf + done) = ioread32(addr + done); > + done += 4; > } > > if ((count - done) & 0x2) { > @@ -948,9 +948,9 @@ static ssize_t ca91cx42_master_write(struct > vme_master_resource *image, > } > > count32 = (count - done) & ~0x3; > - if (count32 > 0) { > - memcpy_toio(addr + done, buf + done, count32); > - done += count32; > + while (done < count32) { > + iowrite32(*(u32 *)(buf + done), addr + done); > + done += 4; > } > > if ((count - done) & 0x2) { > diff --git a/drivers/vme/bridges/vme_tsi148.c > b/drivers/vme/bridges/vme_tsi148.c > index 5fbd08f..bc82991 100644 > --- a/drivers/vme/bridges/vme_tsi148.c > +++ b/drivers/vme/bridges/vme_tsi148.c > @@ -1297,9 +1297,9 @@ static ssize_t tsi148_master_read(struct > vme_master_resource *image, void *buf, > } > > count32 = (count - done) & ~0x3; > - if (count32 > 0) { > - memcpy_fromio(buf + done, addr + done, count32); > - done += count32; > + if (done < count32) { > + *(u32 *)(buf + done) = ioread32(addr + done); > + done += 4; > } > > if ((count - done) & 0x2) { > @@ -1379,9 +1379,9 @@ static ssize_t tsi148_master_write(struct > vme_master_resource *image, void *buf, > } > > count32 = (count - done) & ~0x3; > - if (count32 > 0) { > - memcpy_toio(addr + done, buf + done, count32); > - done += count32; > + if (done < count32) { > + iowrite32(*(u32 *)(buf + done), addr + done); > + done += 4; > } > > if ((count - done) & 0x2) { > > >> >> Does that help? >> > > Yep :-) > > Thanks Greg. That certainly looks promising. I can give that patch a try later this week. Thanks. --Mike _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel