On Fri, Nov 16, 2018 at 3:30 AM Ahmed S. Darwish <darwish.07@xxxxxxxxx> wrote:
>
> Hi Todd,
>
> On Tue, Nov 06, 2018 at 04:20:49PM -0800, Todd Poynor wrote:
> > On Mon, Sep 10, 2018 at 8:28 AM Ahmed S. Darwish <darwish.07@xxxxxxxxx> wrote:
> > >
> > > The gasket in-kernel framework, recently introduced under staging,
> > > re-implements what has long been provided by the UIO subsystem,
> > > with extra PCI BAR remapping and MSI conveniences.
> > >
> > > Before moving it out of staging, make sure we add the new bits to
> > > the UIO framework instead, then transform its single client, the
> > > Apex driver, into a proper UIO driver (uio_driver.h).
> > >
> > > Link: https://lkml.kernel.org/r/20180828103817.GB1397@do-kernel
> >
> > So I'm looking at this for reals now. The BAR mapping stuff is
> > straightforward with the existing framework. Everything else could
> > be done outside of UIO via the existing device interface, but I
> > figured I'd collect any opinions about adding the new bits to UIO.
> >
>
> If it won't slow you down, I'd actually be more than happy to get
> involved in adding the necessary bits to UIO.

Hey, Darwi,

Sounds good to me, thanks. I'm in no hurry about this: we've basically
shipped the first version with the admittedly not-aligned-with-upstream
driver as it stands today, and we're ready to take the time to get this
into proper shape for the next go-round.

Using UIO for BAR-mapped regions is perfectly straightforward, of
course. I have a hack patch to do this for Apex, but I haven't yet
hacked up Gasket to use those regions instead of mapping the memory
itself. I could continue that work soon if it's useful.

The DMA issues are significant. The VFIO framework addresses them, but
it's not trivial. VFIO also does some things Greg specifically objected
to in Gasket, such as wrapping the core PCI subsystem device management
calls, so that doesn't look like an easy path for us either.
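To give a sense of how simple the userspace side of UIO BAR mapping is: UIO exposes each registered memory region as "map N" on the device node, selected by mmap offset N * page_size. A hedged sketch (the device node name, map index, and helper names are illustrative, not from any existing driver):

```c
/* Sketch of mapping a PCI BAR exposed by UIO from userspace.
 * Assumes the kernel side registered the BAR as UIO map N; the
 * device node name (/dev/uioN) and helpers here are illustrative. */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* UIO convention: map N is selected by an mmap offset of
 * N * page_size. */
static off_t uio_map_offset(int map_index)
{
	return (off_t)map_index * sysconf(_SC_PAGESIZE);
}

/* Map UIO map `map_index` of `devnode` (e.g. "/dev/uio0"); `len`
 * should match the size reported in
 * /sys/class/uio/uioN/maps/mapM/size. Returns MAP_FAILED on error. */
static void *uio_map_bar(const char *devnode, int map_index, size_t len)
{
	int fd = open(devnode, O_RDWR | O_SYNC);
	if (fd < 0)
		return MAP_FAILED;
	void *base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
			  fd, uio_map_offset(map_index));
	close(fd); /* the mapping persists after close */
	return base;
}
```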
VFIO also doesn't seem to accommodate buffers that are either allocated
coherent/non-cached or managed with explicit cache maintenance from
userspace, but if anybody knows differently, that would be appreciated.
Edge TPUs (which is what Apex is) have performance-critical paths where
userspace feeds commands through a long-lived buffer that needs either
to be non-cached or to have cache flushes push the data out of the CPU
complex cache, and vice versa for the responses from the device. If we
had a solution for this, we could also move the page_table code to
userspace and rely on the host-side IOMMU to provide the necessary
isolation / protection from malicious mappings.

The low-hanging fruit in extending UIO, as I see it, is multiple-IRQ
support, which should be nicely representable as sysfs attributes. In
particular, pci_enable_msix_range() support, as is standard for PCI
drivers, could make a nice addition to uio_pci_generic as well. My plan
was to foist that on UIO upstream first. Feel free to jump in if you've
got the cycles, as I'm having a hard time scraping together the time
required these days.

Thanks -- Todd

>
> Thanks!
>
> --
> Darwi
> http://darwish.chasingpointers.com

_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
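P.S. For context on the multiple-IRQ point above: today's UIO userspace convention delivers a single interrupt as a blocking read() of a 32-bit running event count on the /dev/uioN fd, and a multi-IRQ extension would presumably generalize that per vector. A hedged userspace sketch of the existing convention (helper names are mine, not from the UIO API):

```c
/* Sketch of the existing single-IRQ UIO userspace convention:
 * read() on an open /dev/uioN fd blocks until an interrupt fires
 * and returns a 32-bit running event count. Helper names here are
 * illustrative. */
#include <stdint.h>
#include <unistd.h>

/* Event counts wrap at 2^32; unsigned subtraction yields the number
 * of interrupts since the last observed count, even across wrap. */
static uint32_t uio_irq_delta(uint32_t prev, uint32_t cur)
{
	return cur - prev;
}

/* Block until the next interrupt on `fd`; returns the new running
 * event count, or `prev` unchanged on a read error. */
static uint32_t uio_wait_irq(int fd, uint32_t prev)
{
	uint32_t count;

	if (read(fd, &count, sizeof(count)) != sizeof(count))
		return prev;
	return count;
}
```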