RE: [PATCH v2 5/5] libnvdimm: add DMA support for pmem blk-mq

From: Dave Jiang
> On 08/03/2017 09:14 AM, Dan Williams wrote:
> > On Thu, Aug 3, 2017 at 8:55 AM, Vinod Koul <vinod.koul@xxxxxxxxx> wrote:
> >> On Thu, Aug 03, 2017 at 08:06:07PM +0530, Jiang, Dave wrote:
> >>>> On Aug 3, 2017, at 1:56 AM, Koul, Vinod <vinod.koul@xxxxxxxxx> wrote:
> >>>>> On Thu, Aug 03, 2017 at 11:06:13AM +0530, Jiang, Dave wrote:
> >>>>>>> On Aug 2, 2017, at 10:25 PM, Koul, Vinod <vinod.koul@xxxxxxxxx> wrote:
> >>>>>>> On Thu, Aug 03, 2017 at 10:41:51AM +0530, Jiang, Dave wrote:
> >>>>>>>>> On Aug 2, 2017, at 9:58 PM, Koul, Vinod <vinod.koul@xxxxxxxxx> wrote:
> >>>>>>>>> On Wed, Aug 02, 2017 at 02:13:56PM -0700, Dave Jiang wrote:
> >>>>>>>>>> On 08/02/2017 02:10 PM, Sinan Kaya wrote:
> >>>>>>>>>> On 8/2/2017 4:52 PM, Dave Jiang wrote:
> >>>>>>>>>>>> Do we need a new API / new function, or new capability?
> >>>>>>>>>>> Hmmm...you are right. I wonder if we need something like DMA_SG cap....
> >>>>>>>>>>
> >>>>>>>>>> Unfortunately, DMA_SG means something else. Maybe, we need DMA_MEMCPY_SG
> >>>>>>>>>> to be similar with DMA_MEMSET_SG.
> >>>>>>>>>
> >>>>>>>>> I'm ok with that if Vinod is.
> >>>>>>>>
> >>>>>>>> So what exactly is the ask here, are you trying to do MEMCPY or SG or MEMSET
> >>>>>>>> or all :). We should have done bitfields for this though...
> >>>>>>>
> >>>>>>> Add DMA_MEMCPY_SG to transaction type.
> >>>>>>
> >>>>>> Not MEMSET right, then why not use DMA_SG, DMA_SG is supposed for
> >>>>>> scatterlist to scatterlist copy which is used to check for
> >>>>>> device_prep_dma_sg() calls
> >>>>>>
> >>>>> Right. But we are doing flat buffer to/from scatterlist, not sg to sg. So
> >>>>> we need something separate than what DMA_SG is used for.
> >>>>
> >>>> Hmm, its SG-buffer  and its memcpy, so should we call it DMA_SG_BUFFER,
> >>>> since it is not memset (or is it) I would not call it memset, or maybe we
> >>>> should also change DMA_SG to DMA_SG_SG to make it terribly clear :D
> >>>
> >>> I can create patches for both.
> >>
> >> Great, anyone who disagrees or can give better names :)
> >
> > All my suggestions would involve a lot more work. If we had infinite
> > time we'd stop with the per-operation-type entry points and make this
> > look like a typical driver sub-system that takes commands like
> > block-devices or usb, but perhaps that ship has sailed.
> 
> Allen, isn't this what we were just talking about on IRC yesterday?
> 
> <allenbh> I dislike prep_tx grabbing a device-specific descriptor.  The
> device really only needs as many device-specific descriptors will keep
> the hw busy, and just let the logical layer queue up logical descriptors
> until hw descriptors become available.  Let the client allocate
> descriptors to submit, and hw driver can translate to a
> hardware-specific descriptor at the last moment before notifying the hw.

Yeah, that last part, "like a typical driver sub-system that takes commands" +1!
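
On the DMA_MEMCPY_SG part of the thread above, a minimal software model of what such an operation would do may help: one side is a flat buffer, the other a scatterlist (unlike DMA_SG, which is scatterlist to scatterlist). The struct and function names here are hypothetical stand-ins, not the kernel's struct scatterlist or any existing dmaengine API:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal scatterlist entry, standing in for the kernel's
 * struct scatterlist. */
struct sg_ent {
	void *addr;
	size_t len;
};

/* What a DMA_MEMCPY_SG op would do, done in software: copy a flat
 * buffer out across a scatterlist, filling each entry in turn until
 * the buffer or the list is exhausted. */
static size_t memcpy_to_sg(struct sg_ent *sg, int nents,
			   const void *buf, size_t len)
{
	size_t off = 0;

	for (int i = 0; i < nents && off < len; i++) {
		size_t n = sg[i].len < len - off ? sg[i].len : len - off;

		memcpy(sg[i].addr, (const char *)buf + off, n);
		off += n;
	}
	return off;	/* bytes copied */
}
```

The reverse direction (scatterlist to flat buffer) is the same loop with src and dst swapped, which is why a single new transaction type with a direction flag, or a pair of prep calls, would cover both.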

Especially if the client could submit a list of commands.  I have an rdma driver that uses the dmaengine api, but it would be more natural to build up a list of dma operations using my driver's own memory and the struct definition of some abstract dma descriptor, without calling any functions of the dmaengine api, and then submit the entire list to the hardware driver all at once, or not at all.  As it stands, in perf I see a lot of heat on the spinlock taken for each individual dma prep in ioat (sorry for picking on you, Dave; this is the only dma engine hw and driver I am familiar with).
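To make that concrete, here is a rough userspace model of the client-side flow I mean; everything here (struct dma_op, dma_submit_list) is hypothetical illustration, not existing dmaengine API, and a plain memcpy stands in for programming hardware descriptors:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical abstract descriptor: the client fills these in from its
 * own memory, with no per-op call into the engine driver and no lock
 * taken per operation. */
struct dma_op {
	void *dst;
	const void *src;
	size_t len;
	struct dma_op *next;
};

/* Model of a hw driver's "submit the whole list at once" entry point:
 * the driver would take its lock once per list, translate each logical
 * descriptor to a hardware descriptor as ring space allows, and kick
 * the hw once.  Here memcpy stands in for all of that. */
static int dma_submit_list(struct dma_op *head)
{
	int n = 0;

	for (struct dma_op *op = head; op; op = op->next) {
		memcpy(op->dst, op->src, op->len);
		n++;
	}
	return n;	/* number of ops accepted; 0 would mean "not at all" */
}
```

The point of the shape is that the lock cost is amortized over the list, and the hw driver is free to translate logical descriptors to hardware descriptors lazily, only as fast as the hardware can consume them.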

Also, I think we could glean some more efficiency if the dma completion path took a hint from napi_schedule() and napi->poll(), instead of scheduling a tasklet directly in the dmaengine hw driver.  For one thing, it could reduce the number of interrupts.  And coordinating the servicing of dma completions with the posting of new work, with the schedule driven from the top of the stack down, could further reduce contention on that lock between dma prep and cleanup/complete.
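As a sketch of the napi-style idea (names hypothetical, not real dmaengine or napi code): the interrupt handler masks the completion interrupt and defers to a poll function, which services at most `budget` completions per call and re-enables the interrupt only when it runs out of work, so completions batch up under load instead of generating one interrupt each:

```c
#include <stdbool.h>

/* Hypothetical per-channel completion state. */
struct chan_state {
	int pending;		/* completions waiting to be serviced */
	bool irq_enabled;	/* model of the hw interrupt mask */
};

/* napi_schedule() analogue: mask the interrupt and (in the real thing)
 * ask the core to call chan_poll() from softirq/thread context. */
static void chan_irq(struct chan_state *ch)
{
	ch->irq_enabled = false;
}

/* napi->poll() analogue: service up to `budget` completions.  If the
 * channel runs dry before the budget is spent, re-arm the interrupt;
 * otherwise leave it masked and let the core poll again later, which
 * is where the interrupt reduction comes from. */
static int chan_poll(struct chan_state *ch, int budget)
{
	int done = 0;

	while (done < budget && ch->pending > 0) {
		ch->pending--;	/* stand-in for completing one descriptor */
		done++;
	}
	if (done < budget)
		ch->irq_enabled = true;
	return done;
}
```

Because the core decides when chan_poll() runs, it can also schedule it to not collide with the prep path, which is the lock-contention angle above.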

http://elixir.free-electrons.com/linux/v4.12.4/source/drivers/dma/ioat/dma.c#L448
http://elixir.free-electrons.com/linux/v4.12.4/source/drivers/dma/ioat/dma.c#L652

Finally, I apologize for not offering to do all this work.  If I were ready to jump in and do it, I would have spoken up earlier.
