On Thu, Feb 6, 2014 at 12:40 AM, Martyn Welch <martyn.welch@xxxxxx> wrote: > > > On 05/02/14 23:21, Michael Kenney wrote: >> >> On Wed, Feb 5, 2014 at 1:38 PM, Michael Kenney <mfkenney@xxxxxxxxx> wrote: >>> >>> On Wed, Feb 5, 2014 at 1:22 PM, Martyn Welch <martyn@xxxxxxxxxxxx> wrote: >>>> >>>> >>>> >>>> >>>> On 5 February 2014 17:41, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote: >>>>> >>>>> >>>>> On Tue, Feb 04, 2014 at 06:34:13PM +0000, Martyn Welch wrote: >>>>>> >>>>>> On 04/02/14 16:34, Michael Kenney wrote: >>>>>>> >>>>>>> Hi Martyn, >>>>>>> >>>>>>> On Tue, Feb 4, 2014 at 7:28 AM, Martyn Welch <martyn.welch@xxxxxx >>>>>>> <mailto:martyn.welch@xxxxxx>> wrote: >>>>>>> >>>>>>> On 28/12/13 00:34, Michael Kenney wrote: >>>>>>> >>>>>>> Hi Martyn, >>>>>>> >>>>>>> On Fri, Dec 27, 2013 at 4:23 PM, Martyn Welch >>>>>>> <martyn@xxxxxxxxxxxx <mailto:martyn@xxxxxxxxxxxx>> wrote: >>>>>>> >>>>>>> On 27/12/13 20:15, Michael Kenney wrote: >>>>>>> >>>>>>> >>>>>>> We are using the vme_tsi148 bridge driver along with >>>>>>> the >>>>>>> vme_user >>>>>>> driver to access the VME boards. The A/D board >>>>>>> requires >>>>>>> D32 bus cycles >>>>>>> and the VME master window is configured accordingly, >>>>>>> however, when >>>>>>> monitoring the bus cycles with a logic analyzer, we >>>>>>> noticed that the >>>>>>> CPU is transferring one byte at a time (i.e. four D8 >>>>>>> transfers rather >>>>>>> than one D32). >>>>>>> >>>>>>> Is this the expected behavior of the tsi148 driver? >>>>>>> >>>>>>> >>>>>>> Hi Mike, >>>>>>> >>>>>>> This is certainly not the expected behaviour - if the >>>>>>> window >>>>>>> is configured >>>>>>> for D32 then it should do 32 bit transfers where >>>>>>> possible. >>>>>>> >>>>>>> I've heard of this happening recently, but haven't yet >>>>>>> been >>>>>>> able to >>>>>>> replicate it. Which VME board are you running Linux on >>>>>>> and >>>>>>> which flavour of >>>>>>> Linux? >>>>>>> >>>>>>> >>>>>>> I'm running Debian 7.2 with kernel 3.2 on a Fastwel CPC600 >>>>>>> (Pentium M >>>>>>> based CPU board). >>>>>>> >>>>>>> >>>>>>> I haven't forgotten about this, still not sure exactly what is >>>>>>> happening. >>>>>>> >>>>>>> Is your install/kernel 32 or 64 bit? >>>>>>> >>>>>>> Are you doing single 32-bit transfers, or are you seeing this on >>>>>>> longer transfers (i.e. copying a buffer full of data)? >>>>>>> >>>>>>> >>>>>>> Thanks for getting back to me. >>>>>>> >>>>>>> I'm running a 32-bit kernel and I see this behavior on all transfers >>>>>>> regardless of buffer size. >>>>>>> >>>>>> >>>>>> Gah! Thought I could see what may be causing it in 64-bit kernels, but >>>>>> not >>>>>> 32-bit (my x86 asm is not particularly hot). >>>>>> >>>>>> I think we /may/ be hitting issues with how the memcpy function gets >>>>>> implemented on specific architectures. The tsi148 is a PCI/X to VME >>>>>> bridge, >>>>>> if it receives a series of 8-bit reads or writes, it translates these >>>>>> to >>>>>> 8-bit reads or writes on the VME bus, which is not necessarily what we >>>>>> want. >>>>>> >>>>>> According to the data sheet, the card you are using has an Intel >>>>>> 6300ESB >>>>>> ICH, which seems to be PCI/X (so we can rule out PCIe to PCI/X bridges >>>>>> or >>>>>> something like that doing nasty things). >>>>>> >>>>>> I think (if I follow everything correctly) then the memcpy for 32-bit >>>>>> is >>>>>> handled by __memcpy in arch/x86/include/asm/string_32.h: >>>>>> >>>>>> static __always_inline void *__memcpy(void *to, const void *from, >>>>>> size_t >>>>>> n) >>>>>> { >>>>>> int d0, d1, d2; >>>>>> asm volatile("rep ; movsl\n\t" >>>>>> "movl %4,%%ecx\n\t" >>>>>> "andl $3,%%ecx\n\t" >>>>>> "jz 1f\n\t" >>>>>> "rep ; movsb\n\t" >>>>>> "1:" >>>>>> : "=&c" (d0), "=&D" (d1), "=&S" (d2) >>>>>> : "0" (n / 4), "g" (n), "1" ((long)to), "2" >>>>>> ((long)from) >>>>>> : "memory"); >>>>>> return to; >>>>>> } >>>>>> >>>>>> I'd expected this function to use movl (32-bit moves) where possible, >>>>>> but >>>>>> movsb to get to naturally aligned moves (which is something that we >>>>>> deal >>>>>> with in the VME code already to use 16-bit reads where we can). >>>>>> >>>>>> Greg (as co-maintainer of the VME subsystem :-) ), Am I reading this >>>>>> right? >>>>> >>>>> >>>>> Yes, but why would it matter if we copy by bytes, bits, or dwords to a >>>>> memory-backed bus? By definition, that shouldn't matter. If it does >>>>> matter, then we shouldn't be using memcpy, or relying on the compiler >>>>> to >>>>> do the copying, but rather we need to use a write() function instead as >>>>> this would be IO memory that does care about this type of thing. >>>>> >>>> >>>> The drivers are using the functions memcpy_toio and memcpy_fromio for >>>> the >>>> bulk of the transfer, these seem to equate to memcpy on x86. >>>> >>>> In this instance it does matter - some devices can only accept certain >>>> width >>>> accesses and the VME bridge will translate a 8-bit PCI transfer to a >>>> 8-bit >>>> transfer on the VME bus. >>>> >>>> So (and I haven't even compile tested this yet), this?: >>>> >>>> diff --git a/drivers/vme/bridges/vme_ca91cx42.c >>>> b/drivers/vme/bridges/vme_ca91cx42.c >>>> index 1425d22c..0d87ebd 100644 >>>> --- a/drivers/vme/bridges/vme_ca91cx42.c >>>> +++ b/drivers/vme/bridges/vme_ca91cx42.c >>>> @@ -894,9 +894,9 @@ static ssize_t ca91cx42_master_read(struct >>>> vme_master_resource *image, >>>> } >>>> >>>> count32 = (count - done) & ~0x3; >>>> - if (count32 > 0) { >>>> - memcpy_fromio(buf + done, addr + done, (unsigned >>>> int)count); >>>> - done += count32; >>>> + while (done < count32) { >>>> + *(u32 *)(buf + done) = ioread32(addr + done); >>>> + done += 4; >>>> } >>>> >>>> if ((count - done) & 0x2) { >>>> @@ -948,9 +948,9 @@ static ssize_t ca91cx42_master_write(struct >>>> vme_master_resource *image, >>>> } >>>> >>>> count32 = (count - done) & ~0x3; >>>> - if (count32 > 0) { >>>> - memcpy_toio(addr + done, buf + done, count32); >>>> - done += count32; >>>> + while (done < count32) { >>>> + iowrite32(*(u32 *)(buf + done), addr + done); >>>> + done += 4; >>>> } >>>> >>>> if ((count - done) & 0x2) { >>>> diff --git a/drivers/vme/bridges/vme_tsi148.c >>>> b/drivers/vme/bridges/vme_tsi148.c >>>> index 5fbd08f..bc82991 100644 >>>> --- a/drivers/vme/bridges/vme_tsi148.c >>>> +++ b/drivers/vme/bridges/vme_tsi148.c >>>> @@ -1297,9 +1297,9 @@ static ssize_t tsi148_master_read(struct >>>> vme_master_resource *image, void *buf, >>>> } >>>> >>>> count32 = (count - done) & ~0x3; >>>> - if (count32 > 0) { >>>> - memcpy_fromio(buf + done, addr + done, count32); >>>> - done += count32; >>>> + if (done < count32) { >>>> + *(u32 *)(buf + done) = ioread32(addr + done); >>>> + done += 4; >>>> } >>>> >>>> if ((count - done) & 0x2) { >>>> @@ -1379,9 +1379,9 @@ static ssize_t tsi148_master_write(struct >>>> vme_master_resource *image, void *buf, >>>> } >>>> >>>> count32 = (count - done) & ~0x3; >>>> - if (count32 > 0) { >>>> - memcpy_toio(addr + done, buf + done, count32); >>>> - done += count32; >>>> + if (done < count32) { >>>> + iowrite32(*(u32 *)(buf + done), addr + done); >>>> + done += 4; >>>> } >>>> >>>> if ((count - done) & 0x2) { >>>> >>>> >>>>> >>>>> Does that help? >>>>> >>>> >>>> Yep :-) >>>> >>>> Thanks Greg. >>> >>> >>> That certainly looks promising. I can give that patch a try later this >>> week. >>> >>> Thanks. >> >> >> Good news Martyn. >> >> I had a bit of time today to test your patch and am happy to say it >> worked! No more bus errors. >> >> Now I can work on getting the actual data acquisition working :-) >> > > Thanks Mike. > > Guess you didn't spot the little bug then :-) Oops, I sure didn't. I was so happy to not see a bus error on my simple one-word read/write test that I hadn't gone any further yet :-). > > For the tsi148 I forgot to change the conditional from "if" to "while". > > I'll submit a proper patch in a minute, I'll give that a try today. Thanks. --Mike _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel