On 17.09.19 at 15:45, Daniel Vetter wrote:
> On Tue, Sep 17, 2019 at 01:24:10PM +0000, Koenig, Christian wrote:
>> On 17.09.19 at 15:13, Daniel Vetter wrote:
>>> On Tue, Sep 17, 2019 at 12:40:51PM +0000, Koenig, Christian wrote:
>>>> On 17.09.19 at 14:31, Daniel Vetter wrote:
>>>>> On Mon, Sep 16, 2019 at 02:23:13PM +0200, Christian König wrote:
>>>>>> Ping? Any further comment on this, or can't we merge at least the
>>>>>> locking change?
>>>>> I was at plumbers ...
>>>>>> Christian.
>>>>>>
>>>>>> On 11.09.19 at 12:53, Christian König wrote:
>>>>>>> On 03.09.19 at 10:05, Daniel Vetter wrote:
>>>>>>>> On Thu, Aug 29, 2019 at 04:29:14PM +0200, Christian König wrote:
>>>>>>>>> This patch is a stripped down version of the locking changes
>>>>>>>>> necessary to support dynamic DMA-buf handling.
>>>>>>>>>
>>>>>>>>> For compatibility we cache the DMA-buf mapping as soon as
>>>>>>>>> exporter/importer disagree on the dynamic handling.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx>
>>>>>>>>> ---
>>>>>>>>>  drivers/dma-buf/dma-buf.c | 90 ++++++++++++++++++++++++++++++++++++---
>>>>>>>>>  include/linux/dma-buf.h   | 51 +++++++++++++++++++++-
>>>>>>>>>  2 files changed, 133 insertions(+), 8 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>>>>>>> index 433d91d710e4..65052d52602b 100644
>>>>>>>>> --- a/drivers/dma-buf/dma-buf.c
>>>>>>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>>>>>>> @@ -525,6 +525,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>>>>>>>>>          return ERR_PTR(-EINVAL);
>>>>>>>>>      }
>>>>>>>>> +    if (WARN_ON(exp_info->ops->cache_sgt_mapping &&
>>>>>>>>> +                exp_info->ops->dynamic_mapping))
>>>>>>>>> +        return ERR_PTR(-EINVAL);
>>>>>>>>> +
>>>>>>>>>      if (!try_module_get(exp_info->owner))
>>>>>>>>>          return ERR_PTR(-ENOENT);
>>>>>>>>> @@ -645,10 +649,11 @@ void dma_buf_put(struct dma_buf *dmabuf)
>>>>>>>>>  EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>>>>>  /**
>>>>>>>>> - * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
>>>>>>>>> + * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list; optionally,
>>>>>>>>>   * calls attach() of dma_buf_ops to allow device-specific attach functionality
>>>>>>>>> - * @dmabuf:          [in]  buffer to attach device to.
>>>>>>>>> - * @dev:             [in]  device to be attached.
>>>>>>>>> + * @dmabuf:          [in]  buffer to attach device to.
>>>>>>>>> + * @dev:             [in]  device to be attached.
>>>>>>>>> + * @dynamic_mapping: [in]  calling convention for map/unmap
>>>>>>>>>   *
>>>>>>>>>   * Returns struct dma_buf_attachment pointer for this attachment. Attachments
>>>>>>>>>   * must be cleaned up by calling dma_buf_detach().
>>>>>>>>> @@ -662,8 +667,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
>>>>>>>>>   * accessible to @dev, and cannot be moved to a more suitable place. This is
>>>>>>>>>   * indicated with the error code -EBUSY.
>>>>>>>>>   */
>>>>>>>>> -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>>> -                                          struct device *dev)
>>>>>>>>> +struct dma_buf_attachment *
>>>>>>>>> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
>>>>>>>>> +                       bool dynamic_mapping)
>>>>>>>>>  {
>>>>>>>>>      struct dma_buf_attachment *attach;
>>>>>>>>>      int ret;
>>>>>>>>> @@ -677,6 +683,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>>>      attach->dev = dev;
>>>>>>>>>      attach->dmabuf = dmabuf;
>>>>>>>>> +    attach->dynamic_mapping = dynamic_mapping;
>>>>>>>>>      mutex_lock(&dmabuf->lock);
>>>>>>>>> @@ -685,16 +692,64 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>>>>>>>>>          if (ret)
>>>>>>>>>              goto err_attach;
>>>>>>>>>      }
>>>>>>>>> +    dma_resv_lock(dmabuf->resv, NULL);
>>>>>>>>>      list_add(&attach->node, &dmabuf->attachments);
>>>>>>>>> +    dma_resv_unlock(dmabuf->resv);
>>>>>>>>>      mutex_unlock(&dmabuf->lock);
>>>>>>>>> +    /* When either the importer or the exporter can't handle dynamic
>>>>>>>>> +     * mappings we cache the mapping here to avoid issues with the
>>>>>>>>> +     * reservation object lock.
>>>>>>>>> +     */
>>>>>>>>> +    if (dma_buf_attachment_is_dynamic(attach) !=
>>>>>>>>> +        dma_buf_is_dynamic(dmabuf)) {
>>>>>>>>> +        struct sg_table *sgt;
>>>>>>>>> +
>>>>>>>>> +        if (dma_buf_is_dynamic(attach->dmabuf))
>>>>>>>>> +            dma_resv_lock(attach->dmabuf->resv, NULL);
>>>>>>>>> +
>>>>>>>>> +        sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
>>>>>>>> Now we're back to enforcing DMA_BIDI, which works nicely around the
>>>>>>>> locking pain, but apparently upsets the arm-soc folks who want to
>>>>>>>> control this better.
>>>>>>> Take another look at dma_buf_map_attachment(), we still try to get the
>>>>>>> caching there for ARM.
>>>>>>>
>>>>>>> What we do here is to bidirectionally map the buffer to avoid the
>>>>>>> locking hydra when importer and exporter disagree on locking.
>>>>>>>
>>>>>>> So the ARM folks can easily avoid that by switching to dynamic locking
>>>>>>> for both.
>>>>> So you still break the contract between importer and exporter, except not
>>>>> for anything that's run in intel-gfx-ci so all is good?
>>>> No, the contract between importer and exporter stays exactly the same as
>>>> it is currently, as long as you don't switch to dynamic dma-buf handling.
>>>>
>>>> There is no functional change for the ARM folks here. The only change
>>>> which takes effect is between i915 and amdgpu, and that is perfectly
>>>> covered by intel-gfx-ci.
>>> There's people who want to run amdgpu on ARM?
>> Sure there are, we even recently fixed some bugs for this.
>>
>> But as far as I know there is no one currently who is affected by this
>> change on ARM with amdgpu.
> But don't you break them with this now?

No, we see the bidirectional attachment as compatible with the other ones.

> amdgpu will soon set the dynamic flag on exports, which forces the caching
> at create time (to avoid the locking fun), which will then result in an
> EBUSY at map_attachment time because we have a cached mapping, but it's
> the wrong type.

See the check in dma_buf_map_attachment():

    if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
        return ERR_PTR(-EBUSY);

Regards,
Christian.

>
>>> Also, x86 doesn't have cache flushing in the dma-api, so naturally this
>>> isn't any issue for us (we still have cache flushing in actual hw, but
>>> that's a different topic). So "works on x86" isn't really a great way to
>>> justify what we do here I think.
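To spell out why a cached bidirectional mapping doesn't trip that check, the
cached path behaves roughly like the sketch below. dma_buf_reuse_cached_sgt()
is a made-up name used purely for illustration; attach->sgt and attach->dir
are assumed to be the cached sg_table and the direction it was created with.

#include <linux/dma-buf.h>
#include <linux/err.h>

/*
 * Illustrative sketch only: reuse a cached sg_table when its direction
 * already covers the requested one. A mapping created as
 * DMA_BIDIRECTIONAL satisfies both DMA_TO_DEVICE and DMA_FROM_DEVICE,
 * so -EBUSY is only returned on a genuine mismatch.
 */
static struct sg_table *
dma_buf_reuse_cached_sgt(struct dma_buf_attachment *attach,
                         enum dma_data_direction direction)
{
    if (!attach->sgt)
        return NULL; /* nothing cached, the caller does a normal map */

    if (attach->dir != direction && attach->dir != DMA_BIDIRECTIONAL)
        return ERR_PTR(-EBUSY);

    return attach->sgt;
}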
>> Well it is the exact same caching we previously had as well, so there is
>> absolutely no functional change here except that we now explicitly note
>> that amdgpu always needs bidirectional mappings.
>>
>> I agree that we should get rid of this caching as soon as possible, but
>> we should not fix things which were broken before.
>>
>> On the other hand adding dma_sg_sync_for_cpu/device sounds like
>> something we could easily add separately to the caching if you think
>> that this will help.
> The current code maybe lacks some cache flushes, but we already require a
> fixed direction per attachment. So I guess not a real problem, probably.
>
> But with your patches I think we now fail with EBUSY. Not exactly nice ...
> -Daniel
>
>> Christian.
>>
>>> -Daniel
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> The other issue with "we solve this with caching the mapping": Currently
>>>>> map/unmap flush (at least on arm, at least in cases where it matters). If
>>>>> you just return the cached sg, then we don't do the flushing anymore,
>>>>> which might break importers/exporters in exactly the same way as just
>>>>> giving them the wrong mapping. There's zero differences between a BIDI,
>>>>> TO_CPU, or TO_DEVICE mapping, the only place where this matters is
>>>>> cache flushing.
>>>>>
>>>>> So here's something that could actually work:
>>>>> - We cache the mapping.
>>>>> - We cache a bidirectional mapping.
>>>>> - We put the right dma_sg_sync_for_cpu/device calls in place for map/unmap
>>>>>   to give current importers/exporters the same behaviour they're used to
>>>>>   now.
>>>>>
>>>>> And yes the caching we've lifted might have broken something somewhere
>>>>> already. But generally you only hear about that a long time after, because
>>>>> arm vendors roll forward once every few years. Or something like that.
>>>>> -Daniel
>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> That's why your previous version moved the caching into
>>>>>>>> map/unmap_sg, which resurrected the locking hydra.
>>>>>>>>
>>>>>>>> I think we're going a bit in circles here, and I don't have a good idea
>>>>>>>> either :-/
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>>
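For illustration, the approach proposed above — keep a single mapping cached
as DMA_BIDIRECTIONAL and do the cache maintenance on every map/unmap — could
look roughly like this. The function names are made up; dma_sync_sg_for_device()
and dma_sync_sg_for_cpu() are the DMA-API spelling of the sync helpers
mentioned above, and attach->sgt is assumed to hold the mapping created at
attach time.

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>

/*
 * Illustrative sketch only: the sg_table was mapped once with
 * DMA_BIDIRECTIONAL at attach time; map/unmap then only perform the
 * cache maintenance importers currently get from a real map/unmap.
 */
static struct sg_table *
cached_map_sketch(struct dma_buf_attachment *attach,
                  enum dma_data_direction direction)
{
    /* same sg/nents pair that was originally passed to dma_map_sg() */
    dma_sync_sg_for_device(attach->dev, attach->sgt->sgl,
                           attach->sgt->orig_nents, direction);
    return attach->sgt;
}

static void cached_unmap_sketch(struct dma_buf_attachment *attach,
                                struct sg_table *sgt,
                                enum dma_data_direction direction)
{
    dma_sync_sg_for_cpu(attach->dev, sgt->sgl, sgt->orig_nents, direction);
}

That would keep the per-direction flushing behaviour current importers rely
on, which the cached DMA_BIDIRECTIONAL table alone would otherwise lose.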