On Nov 18, 2011, at 17:08, Greg KH wrote:

> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
>> Hello fellow hackers.
>>
>> I am maintaining a UIO based driver for a PCI-E data acquisition
>> device.
>>
>> I map BAR0 of the device to userspace. I also map two memory areas:
>> one is used to feed instructions to the acquisition device, the
>> other is used autonomously by the PCI device to write the acquired
>> data.
>
> Nice, have a pointer to your driver anywhere so we can include it in
> the main kernel tree to make your life easier?
>
>> The strategy we have been using for those two shared memory areas
>> has historically been pci_alloc_coherent on v2.6.35 x86_64 (limited
>> to 4MB based on my trials). Later, I made use of the VT-d
>> (intel_iommu) to allocate as much as 128MB (an arbitrary limit),
>> which appears contiguous to the PCI device. I use vmalloc_user to
>> allocate 128MB, then write all the physically contiguous segments
>> into a scatterlist, then use pci_map_sg, which works its way down to
>> intel_iommu. The device DMA addresses I get back are contiguous over
>> the whole 128MB. Neat! Our VT-d capable devices still use this
>> strategy.
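For the archives: condensed, that VT-d strategy looks roughly like the
snippet below. Hypothetical names, error handling mostly trimmed; a
sketch of the technique, not the actual driver source.

/*
 * Rough sketch of the vmalloc_user + scatterlist + pci_map_sg
 * strategy described above. Names are hypothetical and most error
 * handling is trimmed; an illustration, not the real driver.
 */
#include <linux/mm.h>
#include <linux/pci.h>
#include <linux/scatterlist.h>
#include <linux/vmalloc.h>

#define ACQ_BUF_SIZE	(128 * 1024 * 1024)	/* arbitrary 128MB limit */

static void *acq_buf;		/* CPU-side virtual mapping */
static struct sg_table acq_sgt;

static int acq_map_big_buffer(struct pci_dev *pdev)
{
	unsigned int npages = ACQ_BUF_SIZE >> PAGE_SHIFT;
	struct scatterlist *sg;
	unsigned int i;

	/* vmalloc_user() zeroes the pages and allows mmap to userspace */
	acq_buf = vmalloc_user(ACQ_BUF_SIZE);
	if (!acq_buf)
		return -ENOMEM;

	if (sg_alloc_table(&acq_sgt, npages, GFP_KERNEL)) {
		vfree(acq_buf);
		return -ENOMEM;
	}

	/* one scatterlist entry per (physically discontiguous) page */
	for_each_sg(acq_sgt.sgl, sg, npages, i)
		sg_set_page(sg, vmalloc_to_page(acq_buf + i * PAGE_SIZE),
			    PAGE_SIZE, 0);

	/*
	 * With intel_iommu enabled this works its way down to the
	 * IOMMU, which can map the entries back-to-back, so the device
	 * sees one contiguous 128MB DMA range starting at
	 * sg_dma_address(acq_sgt.sgl).
	 */
	if (!pci_map_sg(pdev, acq_sgt.sgl, npages, PCI_DMA_BIDIRECTIONAL)) {
		sg_free_table(&acq_sgt);
		vfree(acq_buf);
		return -ENOMEM;
	}

	return 0;
}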
>> This large memory is mission-critical in making the acquisition
>> device autonomous (real-time), yet keeps the DMA implementation very
>> simple. Today, we are re-using this device on a CPU architecture
>> that has no IOMMU (Intel E6XX/EG20T) and want to avoid creating a
>> scatter-gather scheme between my driver and the FPGA (PCI device).
>>
>> So I went back to the old pci_alloc_coherent method, which although
>> limited to 4MB, will do for the early development phases. Instead of
>> 2.6.35, we are doing preliminary development on 2.6.37 and will
>> probably use 3.1 or later. The CPU/device shared memory maps (1MB
>> and 4MB) are allocated using pci_alloc_coherent and handed to UIO as
>> physical memory using the dma_addr_t returned by the pci_alloc func.

(A rough sketch of this UIO hand-off appears at the end of this
message.)

>> The 1st memory map is written to by the CPU and read by the device.
>> The 2nd memory map is typically written by the device and read by
>> the CPU, but future features may have the device also read this
>> memory.
>>
>> My initial testing on the Atom E6XX shows the PCI device failing
>> when trying to read from the first memory map. I suspect PCI-E
>> payload sizes, which may be somewhat hardcoded in the FPGA
>> firmware... we will confirm this soon.
>
> That would be good to find out.

Just FYI, to close the loop on the issue right above: the problem we
had was that the FPGA was using 64-bit formatted TLPs for its read and
write requests to the system's <4GB RAM, which the PCI-E spec says is
unsupported. This has never been a problem on the other systems we
used, i.e. Core2/ICH9M and Atom-Z5xx/SCH-US15W.

>> Now from the get-go I have felt lucky to have made this work,
>> because of my limited research into the intricacies of the kernel's
>> memory management. So I ask two things:
>>
>> - Is this kosher?
>
> I think so, yes, but others who know the DMA subsystem better than I
> should chime in here, as I might be totally wrong.
>
>> - Is there a better/easier/safer way to achieve this? (Remember that
>> for the second map, the more memory I have, the better. We have a
>> gig of RAM; if I take, say, 256MB, that would be OK too.)
>>
>> I had thought about cutting out a chunk of RAM from the kernel's
>> boot args, but had always feared cache/snooping errors. Not to
>> mention I had no idea how to "claim" or set up this memory once
>> inside my driver's probe function. Maybe I would still be lucky and
>> it would just work? mmmh...
>
> Yeah, don't do that, it might not work out well.
>
> greg k-h
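P.S. For anyone digging this thread out of the archives later, the
pci_alloc_coherent + UIO hand-off mentioned above looks roughly like
the following. Hypothetical names and structure, error handling
trimmed; a sketch of the idea, not our actual driver source.

/*
 * Rough sketch of the pci_alloc_coherent + UIO hand-off described
 * above (1MB instruction area, 4MB acquisition area). Hypothetical
 * names; not the actual driver source.
 */
#include <linux/pci.h>
#include <linux/uio_driver.h>

#define ACQ_CMD_SIZE	(1 * 1024 * 1024)
#define ACQ_DATA_SIZE	(4 * 1024 * 1024)

static struct uio_info acq_uio = {
	.name		= "acq",
	.version	= "0.1",
	.irq		= UIO_IRQ_NONE,
};

static int acq_setup_uio(struct pci_dev *pdev)
{
	dma_addr_t cmd_dma, data_dma;
	void *cmd, *data;

	/* 1st map: CPU writes instructions, device reads them */
	cmd = pci_alloc_coherent(pdev, ACQ_CMD_SIZE, &cmd_dma);
	if (!cmd)
		return -ENOMEM;

	/* 2nd map: device writes acquired data, CPU reads it */
	data = pci_alloc_coherent(pdev, ACQ_DATA_SIZE, &data_dma);
	if (!data) {
		pci_free_consistent(pdev, ACQ_CMD_SIZE, cmd, cmd_dma);
		return -ENOMEM;
	}

	/* BAR0 registers, mapped straight through to userspace */
	acq_uio.mem[0].addr = pci_resource_start(pdev, 0);
	acq_uio.mem[0].size = pci_resource_len(pdev, 0);
	acq_uio.mem[0].memtype = UIO_MEM_PHYS;

	/*
	 * Without an IOMMU the dma_addr_t from pci_alloc_coherent is
	 * the physical address, so it can be handed to UIO directly as
	 * UIO_MEM_PHYS for mmap (with an IOMMU the bus and physical
	 * addresses would differ).
	 */
	acq_uio.mem[1].addr = cmd_dma;
	acq_uio.mem[1].size = ACQ_CMD_SIZE;
	acq_uio.mem[1].memtype = UIO_MEM_PHYS;

	acq_uio.mem[2].addr = data_dma;
	acq_uio.mem[2].size = ACQ_DATA_SIZE;
	acq_uio.mem[2].memtype = UIO_MEM_PHYS;

	return uio_register_device(&pdev->dev, &acq_uio);
}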