Re: [PATCH v4 5/8] dmaengine: dw-edma: Fix programming the source & dest addresses for ep

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 11, 2022 at 11:25:56PM +0530, Manivannan Sadhasivam wrote:
> On Fri, Mar 11, 2022 at 03:38:57PM +0300, Serge Semin wrote:
> > @Manivannan could you join the discussion?
> > 
> > On Thu, Mar 10, 2022 at 02:16:17PM -0600, Zhi Li wrote:
> > > On Thu, Mar 10, 2022 at 1:38 PM Serge Semin <fancer.lancer@xxxxxxxxx> wrote:
> > > >
> > > > On Thu, Mar 10, 2022 at 10:50:14AM -0600, Zhi Li wrote:
> > > > > On Thu, Mar 10, 2022 at 10:32 AM Serge Semin <fancer.lancer@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On Wed, Mar 09, 2022 at 03:12:01PM -0600, Frank Li wrote:
> > > > > > > From: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> > > > > > >
> > > > > > > When eDMA is controlled by the Endpoint (EP), the current logic incorrectly
> > > > > > > programs the source and destination addresses for read and write. Since the
> > > > > > > Root complex and Endpoint uses the opposite channels for read/write, fix the
> > > > > > > issue by finding out the read operation first and program the eDMA accordingly.
> > > > > > >
> > > > > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > > > > Fixes: bd96f1b2f43a ("dmaengine: dw-edma: support local dma device transfer semantics")
> > > > > > > Fixes: e63d79d1ffcd ("dmaengine: Add Synopsys eDMA IP core driver")
> > > > > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> > > > > > > Signed-off-by: Frank Li <Frank.Li@xxxxxxx>
> > > > > > > ---
> > > > > > > No change between v1 to v4
> > > > > > >
> > > > > > >  drivers/dma/dw-edma/dw-edma-core.c | 32 +++++++++++++++++++++++++++++-
> > > > > > >  1 file changed, 31 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
> > > > > > > index 66dc650577919..507f08db1aad3 100644
> > > > > > > --- a/drivers/dma/dw-edma/dw-edma-core.c
> > > > > > > +++ b/drivers/dma/dw-edma/dw-edma-core.c
> > > > > > > @@ -334,6 +334,7 @@ dw_edma_device_transfer(struct dw_edma_transfer *xfer)
> > > > > > >       struct dw_edma_chunk *chunk;
> > > > > > >       struct dw_edma_burst *burst;
> > > > > > >       struct dw_edma_desc *desc;
> > > > > > > +     bool read = false;
> > > > > > >       u32 cnt = 0;
> > > > > > >       int i;
> > > > > > >
> > > > > > > @@ -424,7 +425,36 @@ dw_edma_device_transfer(struct dw_edma_transfer *xfer)
> > > > > > >               chunk->ll_region.sz += burst->sz;
> > > > > > >               desc->alloc_sz += burst->sz;
> > > > > > >
> > > > > > > -             if (chan->dir == EDMA_DIR_WRITE) {
> > > > > > > +             /****************************************************************
> > > > > > > +              *
> > > > > >
> > > > > > > +              *        Root Complex                           Endpoint
> > > > > > > +              * +-----------------------+             +----------------------+
> > > > > > > +              * |                       |    TX CH    |                      |
> > > > > > > +              * |                       |             |                      |
> > > > > > > +              * |      DEV_TO_MEM       <-------------+     MEM_TO_DEV       |
> > > > > > > +              * |                       |             |                      |
> > > > > > > +              * |                       |             |                      |
> > > > > > > +              * |      MEM_TO_DEV       +------------->     DEV_TO_MEM       |
> > > > > > > +              * |                       |             |                      |
> > > > > > > +              * |                       |    RX CH    |                      |
> > > > > > > +              * +-----------------------+             +----------------------+
> > > > > > > +              *
> > > > > > > +              * If eDMA is controlled by the Root complex, TX channel
> > > > > > > +              * (EDMA_DIR_WRITE) is used for memory read (DEV_TO_MEM) and RX
> > > > > > > +              * channel (EDMA_DIR_READ) is used for memory write (MEM_TO_DEV).
> > > > > > > +              *
> > > > > > > +              * If eDMA is controlled by the endpoint, RX channel
> > > > > > > +              * (EDMA_DIR_READ) is used for memory read (DEV_TO_MEM) and TX
> > > > > > > +              * channel (EDMA_DIR_WRITE) is used for memory write (MEM_TO_DEV).
> > > > > >
> > > > > > Either I have some wrong notion about this issue, or something wrong
> > > > > > with the explanation above and with this fix below.
> > > > > >
> > > > > > From my understanding of the possible DW eDMA IP-core setups the
> > > > > > scatch above and the text below it are incorrect. Here is the way the
> > > > > > DW eDMA can be used:
> > > > > > 1) Embedded into the DW PCIe Host/EP controller. In this case
> > > > > > CPU/Application Memory is the memory of the CPU attached to the
> > > > > > host/EP controller, while the remote (link partner) memory is the PCIe
> > > > > > bus memory. In this case MEM_TO_DEV operation is supposed to be
> > > > > > performed by the Tx/Write channels, while the DEV_TO_MEM operation -
> > > > > > by the Rx/Read channels.
> > > > > >
> > > > > > Note it's applicable for both Host and End-point case, when Linux is
> > > > > > running on the CPU-side of the eDMA controller. So if it's DW PCIe
> > > > > > end-point, then MEM_TO_DEV means copying data from the local CPU
> > > > > > memory into the remote memory. In general the remote memory can be
> > > > > > either some PCIe device on the bus or the Root Complex' CPU memory,
> > > > > > each of which is some remote device anyway from the Local CPU
> > > > > > perspective.
> > > > > >
> > > > > > 2) Embedded into the PCIe EP. This case is implemented in the
> > > > > > drivers/dma/dw-edma/dw-edma-pcie.c driver. AFAICS from the commits log
> > > > > > and from the driver code, that device is a Synopsys PCIe EndPoint IP
> > > > > > prototype kit. It is a normal PCIe peripheral device with eDMA
> > > > > > embedded, which CPU/Application interface is connected to some
> > > > > > embedded SRAM while remote (link partner) interface is directed
> > > > > > towards the PCIe bus. At the same time the device is setup and handled
> > > > > > by the code running on a CPU connected to the PCIe Host controller.  I
> > > > > > think that in order to preserve the normal DMA operations semantics we
> > > > > > still need to consider the MEM_TO_DEV/DEV_TO_MEM operations from the
> > > > > > host CPU perspective, since that's the side the DMA controller is
> > > > > > supposed to be setup from.  In this MEM_TO_DEV is supposed to be used
> > > > > > to copy data from the host CPU memory into the remote device memory.
> > > > > > It means to allocate Rx/Read channel on the eDMA controller, so one
> > > > > > would be read data from the Local CPU memory and copied it to the PCIe
> > > > > > device SRAM. The logic of the DEV_TO_MEM direction would be just
> > > > > > flipped. The eDMA PCIe device shall use Tx/Write channel to copy data
> > > > > > from it's SRAM into the Host CPU memory.
> > > > > >
> > > > > > Please note as I understand the case 2) describes the Synopsys PCIe
> > > > > > EndPoint IP prototype kit, which is based on some FPGA code. It's just
> > > > > > a test setup with no real application, while the case 1) is a real setup
> > > > > > available on our SoC and I guess on yours.
> > > > >
> > > >
> > > > > I think yes. But Remote EP also is a one kind of usage module. Just no one
> > > > > writes an EP functional driver for it yet.  Even pci-epf-test was just
> > > > > a test function.
> > > > > I previously sent vNTB patches to implement a virtual network between
> > > > > RC and EP,
> > > > > you can look if you have interest.
> > > >
> > > > AFAIU the remote EP case is the same as 1) anyway. The remote EP is
> > > > handled by its own CPU, which sets up the DW PCIe EP controller
> > > > together with eDMA synthesized into the CPU' SoC. Am I right? While
> > > > the case 2) doesn't have any CPU attached on the PCIe EP. It's just an
> > > > FPGA with PCIe interface and eDMA IP-core installed. In that case all
> > > > the setups are performed by the PCIe Host CPU. That's the root problem
> > > > that causes having all the DEV_TO_MEM/MEM_TO_DEV complications.
> > > >
> > > > So to speak I would suggest for at least to have the scatch fixed in
> > > > accordance with the logic explained in my message.
> > > >
> > > > >
> > > > > >
> > > > > > So what I suggest in the framework of this patch is just to implement
> > > > > > the case 1) only. While the case 2) as it's an artificial one can be
> > > > > > manually handled by the DMA client drivers. BTW There aren't ones available
> > > > > > in the kernel anyway. The only exception is an old-time attempt to get
> > > > > > an eDMA IP test-driver mainlined into the kernel:
> > > > > > https://patchwork.kernel.org/project/linux-pci/patch/cc195ac53839b318764c8f6502002cd6d933a923.1547230339.git.gustavo.pimentel@xxxxxxxxxxxx/
> > > > > > But it was long time ago. So it's unlikely to be accepted at all.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > > -Sergey
> > > > > >
> > > > > > > +              *
> > > > > > > +              ****************************************************************/
> > > > > > > +
> > > > > >
> > > > > > > +             if ((dir == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_READ) ||
> > > > > > > +                 (dir == DMA_DEV_TO_MEM && chan->dir == EDMA_DIR_WRITE))
> > > > > > > +                     read = true;
> > > > > >
> > > >
> > > > > > Seeing the driver support only two directions DMA_DEV_TO_MEM/DMA_DEV_TO_MEM
> > > > > > and EDMA_DIR_READ/EDMA_DIR_WRITE, this conditional statement seems
> > > > > > redundant.
> > > >
> > > > Am I getting a response on this comment? In accordance with that
> > > > conditional statement having dir == DMA_DEV_TO_MEM means performing
> > > > read operation. If dir equals DMA_MEM_TO_DEV then a write operation
> > > > will be performed. The code path doesn't depend on the chan->dir
> > > > value.
> > > 
> > 
> > > Only dir is enough.
> > 
> > Right, in this case the fix is much simpler than suggested here. There
> > is no need in additional local variable and complex conditional
> > statement. It's supposed to be like this:
> > 
> > -             if (chan->dir == edma_dir_write) {
> > +             if (dir == DMA_DEV_TO_MEM) {
> > 
> > See my next comment for a detailed explanation.
> > 
> > > Remote Read,  DMA_DEV_TO_MEM, it is a write channel.
> > > SAR is the continual address at EP Side, DAR is a scatter list. RC side
> > > 
> > > Local Read,  DMA_DEV_TO_MEM, it is a reading channel.
> > > SAR is the continual address at RC side,  DAR is a scatter list at EP side
> > 
> > Right, it's a caller responsibility to use a right channel for the
> > operation (by flipping the channel the caller will invert the whole
> > logic). But As I see it what you explain and my notion don't match to what
> > is depicted on the scatch and written in the text below it. Don't you see?
> > 
> > -              *        Root Complex                           Endpoint
> > +              * Linux Root Port/End-point                  PCIe End-point
> >                * +-----------------------+             +----------------------+
> > -              * |                       |    TX CH    |                      |
> > -              * |                       |             |                      |
> > -              * |      DEV_TO_MEM       <-------------+     MEM_TO_DEV       |
> > +              * |                       |             |                      |
> > +              * |                       |             |                      |
> > +              * |    DEV_TO_MEM   Rx Ch <-------------+ Tx Ch  DEV_TO_MEM    |
> >                * |                       |             |                      |
> >                * |                       |             |                      |
> > -              * |      MEM_TO_DEV       +------------->     DEV_TO_MEM       |
> > -              * |                       |             |                      |
> > -              * |                       |    RX CH    |                      |
> > +              * |    MEM_TO_DEV   Tx Ch +-------------> Rx Ch  MEM_TO_DEV    |
> > +              * |                       |             |                      |
> > +              * |                       |             |                      |
> >                * +-----------------------+             +----------------------+
> >                *
> > -              * If eDMA is controlled by the Root complex, TX channel
> > -              * (EDMA_DIR_WRITE) is used for memory read (DEV_TO_MEM) and RX
> > -              * channel (EDMA_DIR_READ) is used for memory write (MEM_TO_DEV).
> > +              * If eDMA is controlled by the RP/EP, Rx channel
> > +              * (EDMA_DIR_READ) is used for device read (DEV_TO_MEM) and Tx
> > +              * channel (EDMA_DIR_WRITE) is used for device write (MEM_TO_DEV).
> > +              * (Straightforward case.)
> >                *
> > -              * If eDMA is controlled by the endpoint, RX channel
> > -              * (EDMA_DIR_READ) is used for memory read (DEV_TO_MEM) and TX
> > -              * channel (EDMA_DIR_WRITE) is used for memory write (MEM_TO_DEV).
> > +              * If eDMA is embedded into an independent PCIe EP, Tx channel
> > +              * (EDMA_DIR_WRITE) is used for device read (DEV_TO_MEM) and Rx
> > +              * channel (EDMA_DIR_READ) is used for device write (MEM_TO_DEV).
> > 
> > I think what was suggested above explains well the semantics you are
> > trying to implement here in the framework of this patch.
> > 
> > > 
> > > Actually,  both sides should support a scatter list. Like
> > > device_prep_dma_memcpy_sg
> > > but it is beyond this patch series.
> > 
> 

> As I said in other reply, my reasoning was entirely based on the eDMA test
> application. Since it came from Synopsys, I went with that logic of channel
> configuration.
> 
> But if someone from Synopsys confirms that your logic is correct, I'm fine with
> reworking my commit.

There is no need in waiting for the Synopsys response. I've got info
manuals of my own to say:
1) Your fix is correct, but the conditional statement has redundant
operations. It can be simplified as I suggested.
2) Your scatch and text are incorrect and they must be fixed so not to
be confusing.

-Sergey

> 
> Thanks,
> Mani
> 
> > Right, it's beyond your series too, because that feature requires
> > additional modifications. I am not asking about that.
> > 
> > -Sergey
> > 
> > > 
> > > >
> > > > -Sergey
> > > >
> > > > > >
> > > > > > > +
> > > > > > > +             /* Program the source and destination addresses for DMA read/write */
> > > > > > > +             if (read) {
> > > > > > >                       burst->sar = src_addr;
> > > > > > >                       if (xfer->type == EDMA_XFER_CYCLIC) {
> > > > > > >                               burst->dar = xfer->xfer.cyclic.paddr;
> > > > > > > --
> > > > > > > 2.24.0.rc1
> > > > > > >



[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux PCI]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux