On Mon, Mar 7, 2022 at 3:03 PM Robin Murphy <robin.murphy@xxxxxxx> wrote: > > On 2022-03-07 12:47, Gilad Ben-Yossef wrote: > > On Mon, Mar 7, 2022 at 2:36 PM Robin Murphy <robin.murphy@xxxxxxx> wrote: > >> > >> On 2022-03-07 12:17, Gilad Ben-Yossef wrote: > >>> On Mon, Mar 7, 2022 at 1:14 PM Robin Murphy <robin.murphy@xxxxxxx> wrote: > >>> > >>>> The "overlap" is in the sense of having more than one mapping within the > >>>> same cacheline: > >>>> > >>>> [ 142.458120] DMA-API: add_dma_entry start P=ba79f200 N=ba79f > >>>> D=ba79f200 L=10 DMA_FROM_DEVICE attrs=0 > >>>> [ 142.458156] DMA-API: add_dma_entry start P=445dc010 N=445dc > >>>> D=445dc010 L=10 DMA_TO_DEVICE attrs=0 > >>>> [ 142.458178] sun8i-ss 1c15000.crypto: SRC 0/1/1 445dc000 len=16 bi=0 > >>>> [ 142.458215] sun8i-ss 1c15000.crypto: DST 0/1/1 ba79f200 len=16 bi=0 > >>>> [ 142.458234] DMA-API: add_dma_entry start P=ba79f210 N=ba79f > >>>> D=ba79f210 L=10 DMA_FROM_DEVICE attrs=0 > >>>> > >>>> This actually illustrates exactly the reason why this is unsupportable. > >>>> ba79f200 is mapped for DMA_FROM_DEVICE, therefore subsequently mapping > >>>> ba79f210 for DMA_TO_DEVICE may cause the cacheline covering the range > >>>> ba79f200-ba79f23f to be written back over the top of data that the > >>>> device has already started to write to memory. Hello data corruption. > >>>> > >>>> Separate DMA mappings should be from separate memory allocations, > >>>> respecting ARCH_DMA_MINALIGN. > >>> > >>> hmm... I know I'm missing something here, but how does this align with > >>> the following from active_cacheline_insert() in kernel/dma/debug.c ? > >>> > >>> /* If the device is not writing memory then we don't have any > >>> * concerns about the cpu consuming stale data. This mitigates > >>> * legitimate usages of overlapping mappings. > >>> */ > >>> if (entry->direction == DMA_TO_DEVICE) > >>> return 0; > >> > >> It's OK to have multiple mappings that are *all* DMA_TO_DEVICE, which > >> looks to be the case that this check was intended to allow. However I > >> think you're right that it should still actually check for conflicting > >> directions between the new entry and any existing ones, otherwise it > >> ends up a bit too lenient. > >> > >> Cheers, > >> Robin. > > > > I understand what you are saying about why checking for conflicting > > directions may be a good thing, but given that the code is as it is > > right now, how are we seeing the warning for two mapping that one of > > them is DMA_TO_DEVICE? > > Because it's the second one that isn't. The warning is triggered by > adding the DMA_FROM_DEVICE entry, which *is* checked, and finds the > DMA_TO_DEVICE entry already present. What's not great is that if those > two mappings happened to be made in the opposite order then it would be > missed entirely. Please accept my sincere apologies if I'm being daft , but here is the code for add_dma_entry(): static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs) { struct hash_bucket *bucket; unsigned long flags; int rc; bucket = get_hash_bucket(entry, &flags); hash_bucket_add(bucket, entry); put_hash_bucket(bucket, flags); rc = active_cacheline_insert(entry); if (rc == -ENOMEM) { pr_err("cacheline tracking ENOMEM, dma-debug disabled\n"); global_disable = true; } else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) { err_printk(entry->dev, entry, "cacheline tracking EEXIST, overlapping mappings aren't supported\n"); } } Clearly we get to active_cacheline_insert() unconditionally. Here is the code of active_cacheline_insert(): static int active_cacheline_insert(struct dma_debug_entry *entry) { phys_addr_t cln = to_cacheline_number(entry); unsigned long flags; int rc; /* If the device is not writing memory then we don't have any * concerns about the cpu consuming stale data. This mitigates * legitimate usages of overlapping mappings. */ if (entry->direction == DMA_TO_DEVICE) return 0; spin_lock_irqsave(&radix_lock, flags); rc = radix_tree_insert(&dma_active_cacheline, cln, entry); if (rc == -EEXIST) active_cacheline_inc_overlap(cln); spin_unlock_irqrestore(&radix_lock, flags); return rc; } Clearly the check for direction happens BEFORE we ever attempt to add the cacheline tracking data. So it shouldn't matter at all which is first and which is second... :-( I know I'm missing something. But what? Thanks, Gilad -- Gilad Ben-Yossef Chief Coffee Drinker values of β will give rise to dom!