On Fri, 3 Nov 2023 21:50:53 +0100 Petr Tesařík <petr@xxxxxxxxxxx> wrote:

> > In our opinion the first step towards getting this right is to figure
> > out what the different kinds of alignments are really supposed to mean.
> > For each of the mechanisms we need to understand and document whether
> > it is about making sure that the bounce buffer does not stretch over
> > more of certain units of memory (like pages, iova granule size,
> > whatever), or about preserving the offset within a certain unit of
> > memory, and if yes, to what extent (the least significant n bits of
> > orig_addr dictated by the respective mask, or something different).
>
> Seconded. I have also been struggling with the various alignment
> constraints. I have even written (but not yet submitted) a patch to
> calculate the combined alignment mask in swiotlb_tbl_map_single() and
> pass it down to all other functions, just to make it clear what
> alignment mask is used.

Can you cc me when posting that rework?

> My understanding is that buffer alignment may be required by:
>
> 1. hardware which cannot handle an unaligned base address (presumably
> because the chip performs a simple OR operation to get the addresses
> of individual fields);

I'm not sure I understood this properly. What is "base address" in this
context? For swiotlb, is the "base address" basically the return value of
swiotlb_tbl_map_single() -- what I previously referred to as tlb_addr?

If yes, I understand that satisfying 1 means satisfying
tlb_addr & combined_mask == 0, where combined_mask describes the
combined alignment requirement, i.e.
combined_mask == min_align_mask | alloc_align_mask |
(alloc_size < PAGE_SIZE ? 0 : (PAGE_SIZE - 1)).
Does that sound about right?

Can we assume that if 1. applies, then the start address of the mapping,
that is orig_addr, needs to be already aligned?

> 2. isolation of untrusted devices, where no two bounce buffers should
> end up in the same iova granule;
>
> 3. allocation size; I could not find an explanation, so this might be
> merely an attempt at reducing SWIOTLB internal fragmentation.

Assuming I understood 1 correctly, I think we are missing something:

4. preserve the n (0 <= n <= 31) lowest bits of all addresses within
the mapping.

If it were just 1, 2 and 3, we wouldn't need the whole offsetting
business introduced by commit 1f221a0d0dbf ("swiotlb: respect
min_align_mask"). Let me cite from its commit message:

"""
Respect the min_align_mask in struct device_dma_parameters in swiotlb.

There are two parts to it:
 1) for the lower bits of the alignment inside the io tlb slot, just
    extent the size of the allocation and leave the start of the slot
    empty
 2) for the high bits ensure we find a slot that matches the high bits
    of the alignment to avoid wasting too much memory

Based on an earlier patch from Jianxiong Gao <jxgao@xxxxxxxxxx>.
"""

Do we agree that 4. needs to be added to the list? Or was it supposed
to be covered by 1.?

AFAIU 4. is about either
(tlb_addr & combined_mask) == (orig_addr & combined_mask)
or about
(tlb_addr & min_align_mask) == (orig_addr & min_align_mask),
and I would like to know which of the two options it is.

Cc-ing Jianxiong.

> I hope other people on the Cc list can shed more light on the intended
> behaviour.