Re: [RFC PATCH v1] Xilinx AXI-Stream FIFO v4.1 IP core driver

On Mon, Jul 09, 2018 at 09:43:28AM +0200, Greg KH wrote:
> On Sun, Jul 08, 2018 at 02:25:38PM -0400, jacob feder wrote:
> > On Sun, Jul 8, 2018 at 9:28 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > 
> >     On Sat, Jul 07, 2018 at 10:19:40PM -0400, Jacob Feder wrote:
> >     > Hi all,
> >     > I have developed this driver for a Xilinx-provided IP block for their
> >     > Zynq SoC. I fixed all of the checkpatch.pl problems I knew how to. If
> >     > someone could chime in on how to fix the remaining few it would be
> >     > appreciated.
> >     >
> >     > Also looking for feedback on general structure. It's my first driver
> >     > (and patch submission) so I'm sure there are plenty of things to be
> >     > improved on :).
> >     >
> >     > Functionally everything works OK except the performance isn't as good
> >     > as I was hoping for. I have been testing it by operating the FIFO in
> >     > loopback mode (AXI-Stream TX interface feeding back into RX interface)
> >     > running on the XC7Z020 (7000 series) Zynq device. I am getting anything
> >     > between 3-16 MB/s depending on the amount of data transferred. The
> >     > performance claimed by the PG080 datasheet is ~65 MB/s. The CPU isn't
> >     > under significant load (~12%) as reported by top, so I don't think
> >     > that's the bottleneck.
> >     >
> >     > Please +CC in responses as I'm not on the mailing list.
> >     >
> >     > Cheers
> >     >
> >     >
> >     > This IP core has read and write AXI-Stream FIFOs, the contents of
> >     > which can be accessed from the AXI4 memory-mapped interface. This is
> >     > useful for transferring data from a processor into the FPGA fabric.
> >     > The driver creates a character device that can be read/written to with
> >     > standard open/read/write/close.
> > 
> >     Why not use the uio api, which allows userspace to mmap the memory of
> >     the device and access it directly from userspace?  That should make
> >     things a lot faster, right?
> > 
> > 
> > 
> > I thought about the UIO method but based on what I read it seemed like more of
> > a hack (and also doesn't expose interrupts?). Whether it would make
> > things faster I have no idea.
> 
> Direct mmap is faster than serial read/write, right?

I'm not sure - to me it seems the same unless mmap is using a different
technique under the hood. With mmap you would still need to write serially
to the FIFO registers; you can't just copy the data into a block of memory.
It all has to go through the same spot (the TDFD register, or RDFD when
reading).
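
To make that concrete, here is a rough userspace sketch of what an
mmap-based (e.g. UIO) transmit path would still have to do. The /dev/uio0
name, the 4 KiB map size, and the TDFD/TLR offsets (0x10/0x14) are
assumptions from my reading of PG080, not taken from the driver:

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define TDFD 0x10	/* TX data FIFO "keyhole" register (per PG080) */
#define TLR  0x14	/* TX length register (per PG080) */

/* push one packet through the mmap'd register window, word by word */
static void send_packet(volatile uint32_t *regs, const uint32_t *buf,
			size_t words)
{
	size_t i;

	/* every word goes to the *same* register - no bulk memcpy possible */
	for (i = 0; i < words; i++)
		regs[TDFD / 4] = buf[i];

	/* writing the packet length in bytes commits it to the stream */
	regs[TLR / 4] = words * 4;
}

int main(void)
{
	uint32_t data[16] = { 0 };
	volatile uint32_t *regs;
	int fd;

	fd = open("/dev/uio0", O_RDWR);
	if (fd < 0)
		return 1;

	regs = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (regs == MAP_FAILED) {
		close(fd);
		return 1;
	}

	send_packet(regs, data, 16);

	munmap((void *)regs, 4096);
	close(fd);
	return 0;
}

So as far as I can tell, mmap only removes the syscall overhead; the
per-word register writes stay the same.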

> 
> > 
> >     Or if that doesn't work, what about the fpga API the kernel now has?
> >     Would that work for this hardware?
> > 
> > 
> > 
> > I'm not totally sure what you're referring to here, but I think the FPGA kernel
> > drivers are for downloading bitstreams to the FPGA (bitstream is
> > equivalent to asm for fpgas), which isn't what I'm trying to do. 
> 
> Ah, ok.
> 
> >     >
> >     > See Xilinx PG080 document for IP details.
> > 
> >     Do you have a link to that?  If so, can you put it in here?
> > 
> > 
> > 
> > https://www.xilinx.com/support/documentation/ip_documentation/axi_fifo_mm_s/v4_1/pg080-axi-fifo-mm-s.pdf
> 
> Can you put this in the changelog message please?
> 

Yes, will do in v2.

> >     > Currently supports only store-forward mode with a 32-bit
> >     > AXI4-Lite interface. DOES NOT support:
> >     >       - cut-through mode
> >     >       - AXI4 (non-lite)
> >     >
> >     > Signed-off-by: Jacob Feder <jacobsfeder@xxxxxxxxx>
> >     > ---
> >     >  drivers/staging/axisfifo/axis-fifo.c | 1296 ++++++++++++++++++++++++++++++++++
> >     >  drivers/staging/axisfifo/axis-fifo.h |  119 ++++
> >     >  2 files changed, 1415 insertions(+)
> >     >  create mode 100644 drivers/staging/axisfifo/axis-fifo.c
> >     >  create mode 100644 drivers/staging/axisfifo/axis-fifo.h
> > 
> >     Why does a single .c file need a .h file?
> > 
> > 
> > 
> > Good point... this can be consolidated :)
> >  
> > 
> >     I'll be glad to take this driver, as others can clean it up in-tree (I
> >     think your locking is crazy and is probably causing a lot of performance
> >     issues), but I need a TODO file for it listing what you think needs to be
> >     done in order to get this out of the staging tree.
> > 
> > 
> > 
> > I'm confused about why you don't like the locking - all I'm doing is locking on
> > open() calls to prevent multiple userspace apps from reading/writing
> > to the fifo simultaneously.  This shouldn't reduce performance because
> > the mutexes are only tested on open() not on read() or write().
> > Presumably the user is only opening once.
> 
> Trying to "protect" userspace from itself is not really a good idea.
> Grabbing a lock over open/close is ripe for problems.  You really are
> not keeping multiple owners from accessing the device at the same time,
> think about passing around a file descriptor to different processes or
> threads.  Your kernel code does not protect from that, so it is best
> just not to worry about that at all.  If userspace does foolish things,
> let it deal with the result :)
> 

Ok, I see. Will remove the mutexes :)
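
If it turns out the hardware sequence really does need serialising, my
understanding of your point is that the lock belongs around the register
sequence in write() itself rather than in open(). A minimal sketch of what
I have in mind (the struct, field names and offsets are mine for
illustration, not the submitted code, and the TDFV vacancy check is
omitted):

#include <linux/fs.h>
#include <linux/io.h>
#include <linux/mutex.h>
#include <linux/uaccess.h>

#define TDFD_OFFSET 0x10	/* TX data "keyhole" register (per PG080) */
#define TLR_OFFSET  0x14	/* TX length register (per PG080) */

/* hypothetical per-device state; open() would stash this in private_data */
struct axis_fifo {
	void __iomem *regs;
	struct mutex write_lock;	/* serialises one TX packet at a time */
};

static ssize_t axis_fifo_write(struct file *file, const char __user *buf,
			       size_t len, loff_t *off)
{
	struct axis_fifo *fifo = file->private_data;
	u32 word;
	size_t i;

	if (len % sizeof(u32))
		return -EINVAL;

	/* guard the register *sequence*, not open() */
	mutex_lock(&fifo->write_lock);

	for (i = 0; i < len; i += sizeof(u32)) {
		if (copy_from_user(&word, buf + i, sizeof(u32))) {
			mutex_unlock(&fifo->write_lock);
			return -EFAULT;
		}
		iowrite32(word, fifo->regs + TDFD_OFFSET);
	}

	/* writing the length in bytes commits the packet to the stream */
	iowrite32(len, fifo->regs + TLR_OFFSET);

	mutex_unlock(&fifo->write_lock);
	return len;
}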

> > I think locking is necessary - if the hardware registers are accessed in the
> > wrong order it goes into an unknown state and must be reset (and will
> > probably cause a kernel panic).
> 
> How will the kernel crash because of this?
> 

Good question... I couldn't tell you exactly how, but it has happened to me
before after writing to the IP registers in the wrong order.
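
For reference, the recovery I use when the core gets into that state is the
reset sequence described in PG080 (a minimal sketch; the register offsets
and the 0xA5 reset key are from my reading of the register map and should
be double-checked against the datasheet):

#include <linux/io.h>

#define XLLF_TDFR_OFFSET 0x8	/* TX FIFO reset */
#define XLLF_RDFR_OFFSET 0x18	/* RX FIFO reset */
#define XLLF_SRR_OFFSET  0x28	/* AXI4-Stream / core reset */
#define XLLF_RESET_KEY   0xa5	/* magic value that triggers the reset */

/* put the IP back into a known state after a bad register sequence */
static void axis_fifo_reset(void __iomem *regs)
{
	iowrite32(XLLF_RESET_KEY, regs + XLLF_SRR_OFFSET);
	iowrite32(XLLF_RESET_KEY, regs + XLLF_TDFR_OFFSET);
	iowrite32(XLLF_RESET_KEY, regs + XLLF_RDFR_OFFSET);
}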

> >     Or, if you think it should use the fpga or uio interface instead, maybe
> >     it's just easier to redo it based on that?
> > 
> >     thanks,
> > 
> >     greg k-h
> > 
> > 
> > I have made some slight modifications today. Let me know if there are any
> > other things you think I should change. I can integrate those and then
> > resubmit a v2 for you to bring into the tree.
> > 
> > In terms of TODO it's all OK as far as I'm concerned unless someone wants
> > to look into the performance. (Actually, see below - I think this could
> > explain why the performance of my system is lower - Xilinx is using some
> > sort of DMA driver instead of ioremap.)
> > https://forums.xilinx.com/xlnx/attachments/xlnx/ELINUX/13011/2/Linux-DMA-In-Device-Drivers.pdf
> 
> mmap might be the best for what you want, try uio out and see if that
> helps in your speed...
> 
> thanks,
> 
> greg k-h

Thanks,
Jacob
_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel



