On Tue, Jan 11, 2022 at 06:53:06PM -0400, Jason Gunthorpe wrote:
> IOMMU is not common in those cases, it is slow.
>
> So you end up with 16 bytes per entry then another 24 bytes in the
> entirely redundant scatter list.  That is now 40 bytes/page for typical
> HPC case, and I can't see that being OK.

Ah, I didn't realise what case you wanted to optimise for.  So, how
about this ...  Since you want to get to the same destination as I do
(a 16-byte-per-entry dma_addr+dma_len struct), but need to get there
sooner than "make all sg users stop using it wrongly", let's introduce
a (hopefully temporary) "struct dma_range".

But let's go further than that (which only brings us to 32 bytes per
range).  For the systems you care about which use an identity mapping,
and have sizeof(dma_addr_t) == sizeof(phys_addr_t), we can simply
point the dma_range pointer to the same memory as the phyr.  We just
have to not free it too early.  That gets us down to 16 bytes per
range, a saving of 50%.
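
A rough userspace sketch of the idea (the member names, the typedef
stand-ins for the kernel's dma_addr_t/phys_addr_t, and the helper name
dma_map_identity are all my invention, not an actual proposal):

```c
#include <stdint.h>

/* Stand-ins for the kernel types; on the systems in question both are 64-bit. */
typedef uint64_t dma_addr_t;
typedef uint64_t phys_addr_t;

/* A physical range ("phyr"): 16 bytes per entry. */
struct phyr {
	phys_addr_t addr;
	uint64_t    len;
};

/* The proposed dma_addr+dma_len struct: also 16 bytes per entry. */
struct dma_range {
	dma_addr_t addr;
	uint64_t   len;
};

/*
 * With an identity mapping and equal-sized address types, the dma_range
 * array can simply alias the phyr array instead of being allocated
 * separately -- the caller just must not free the phyr array while the
 * dma_range view is still in use.
 */
static struct dma_range *dma_map_identity(struct phyr *p)
{
	_Static_assert(sizeof(struct dma_range) == sizeof(struct phyr),
		       "layouts must match to alias");
	return (struct dma_range *)p;
}
```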