On Thu, Mar 20, 2025 at 05:08:46PM +0100, Petr Tesarik wrote:
> Mark Brown <broonie@xxxxxxxxxx> wrote:
> > On Thu, Mar 20, 2025 at 03:35:36PM +0100, Petr Tesarik wrote:

> > > CC'ing Robin Murphy, because there seem to be some doubts about DMA API
> > > efficiency.

> > Or possibly just documentation, the number of memory types we have to
> > deal with and disjoint interfaces makes all this stuff pretty miserable.

> I have to agree here. Plus the existing documentation is confusing, as
> it introduces some opaque terms: streaming, consistent, coherent ...
> what next?

> I volunteer to clean it up a bit. Or at least to give it a try.

That would be amazing.

> If we want to make life easier for authors who don't need to squeeze
> the last bit of performance from their driver, the core DMA API could be
> extended with a wrapper function that checks DMA-ability of a buffer
> address and takes the appropriate action. I kind of like the idea, but
> I'm not a subsystem maintainer, so my opinion doesn't mean much. ;-)

That sounds sensible. There's the dance that spi_{map,unmap}_buf() is
doing which feels like it should be more generic, a general "I have this
buffer, make it DMAable" which sounds like the same sort of ballpark and
I always thought could be usefully factored out but never got round to
finding a home for.

> > > I still believe the SPI subsystem should not try to be clever. The
> > > DMA API already avoids unnecessary copying as much as possible.

> > It's not particularly trying to be clever here?

> Well, it tries to guess whether the lower layer will have to make a
> copy, but it does not always get it right (e.g. memory encryption).

> Besides, txbuf and rxbuf might be used without any copying at all, e.g.
> if they point to direct-mapped virtual addresses (e.g. returned by
> kmalloc).

> At the end of the day, it's no big deal, because SPI transfers are
> usually small and not performance-critical. I wouldn't be bothered
> myself if it wasn't part of a larger project (getting rid of DMA zones).

Some of the IIO users might beg to differ about performance criticality
and transfer sizes there, and there's things like firmware download and
SPI flashes too. A lot of the performance work on the subsystem came
from people with CAN controllers they're trying to saturate, some of
which was large messages. It's not the same situation as block devices
or networking but it's an area where anything we can do to eliminate
dead time on the bus can be really noticeable to applications. It gets
used a lot with mixed signal applications where implementing digital
logic is expensive but you might want to get a lot of data in or out.
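
For illustration only, the "I have this buffer, make it DMAable" idea
discussed above can be modelled in plain userspace C. This is a sketch
under stated assumptions, not kernel code and not a proposed interface:
every name in it (make_dmaable(), release_dmaable(), buf_is_dmaable(),
struct dmaable_buf, non_dmaable_arena) is hypothetical, the DMA-ability
check is faked with a static array standing in for memory a device could
not reach, and the "appropriate action" is assumed to be a bounce copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only: none of these names exist in the kernel. */

enum xfer_dir { XFER_TO_DEVICE, XFER_FROM_DEVICE };

struct dmaable_buf {
	void *cpu_addr;		/* buffer the (modelled) device would use */
	void *orig;		/* buffer the caller passed in */
	size_t len;
	enum xfer_dir dir;
	bool bounced;		/* true if cpu_addr is a temporary copy */
};

/* Assumption: stands in for memory that must not be handed to a device. */
unsigned char non_dmaable_arena[4096];

/* Hypothetical check: anything outside the arena counts as DMA-able. */
bool buf_is_dmaable(const void *buf)
{
	uintptr_t p = (uintptr_t)buf;
	uintptr_t lo = (uintptr_t)non_dmaable_arena;

	return !(p >= lo && p - lo < sizeof(non_dmaable_arena));
}

/*
 * Use the caller's buffer directly when it is DMA-able; otherwise
 * allocate a bounce buffer and, for outgoing data, copy into it.
 * Returns 0 on success, -1 if the bounce allocation fails.
 */
int make_dmaable(struct dmaable_buf *m, void *buf, size_t len,
		 enum xfer_dir dir)
{
	assert(m && buf && len > 0);

	m->orig = buf;
	m->len = len;
	m->dir = dir;
	m->bounced = !buf_is_dmaable(buf);

	if (!m->bounced) {
		m->cpu_addr = buf;	/* no copy needed */
		return 0;
	}

	m->cpu_addr = malloc(len);
	if (!m->cpu_addr)
		return -1;
	if (dir == XFER_TO_DEVICE)
		memcpy(m->cpu_addr, buf, len);
	return 0;
}

/* For incoming data, copy back from the bounce buffer, then free it. */
void release_dmaable(struct dmaable_buf *m)
{
	if (!m->bounced)
		return;
	if (m->dir == XFER_FROM_DEVICE)
		memcpy(m->orig, m->cpu_addr, m->len);
	free(m->cpu_addr);
	m->cpu_addr = NULL;
}
```

A caller would pair make_dmaable() before the transfer with
release_dmaable() after it and never needs to know whether a copy
happened. In the kernel the check and the copy would have to come from
the DMA layer itself, which is what can see cases such as memory
encryption that a subsystem-level guess gets wrong.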