Hi Mark!

> On 12.05.2019, at 10:54, Mark Brown <broonie@xxxxxxxxxx> wrote:
>
> On Thu, May 09, 2019 at 09:47:08PM +0200, Martin Sperl wrote:
>
>> While thinking about this again maybe an idea:
>> What about implement a second spi_transfer_one implementation (together
>> with a message pump implementation) that would handle things correctly.
>
>> Any driver then can select the old (default) or new implementation and thus
>> would allow the optimizations to take place only for verified working drivers...
>
> I'd rather avoid having yet another interface for drivers to use, people
> already get confused trying to choose between the ones we already have.
> It'd have to be something where the existing drivers got actively
> converted and the old interface retired rather than something that hangs
> around.

I totally understand that.

>> What I would then also like to do for the new implementation is modify the
>> API a bit - ideally I would like to:
>> * Make spi_sync the primary interface which the message pump is also
>>   using directly
>> * move all the prepare stuff early into spi-sync, so that for example the
>>   preparing (including dma mapping) would get done in the calling thread
>>   and only the prepared message would get submitted to the queue
>>   - special processing would be needed for the spi-async case.
>
> IIRC the mapping is deliberately done late in order to minimize the
> amount of time we're consuming resources for the mapping, there were
> some systems that had limited DMA channels.  However I don't know how
> big a concern that is in this day and age with even relatively old
> systems.

We may be able to make the mapping either early or late. The place where
it really makes a difference is when we are running in the message pump
(because of spi_async, or because multiple threads are writing to the
same SPI bus via spi_sync).

> The idea of spi_async() having a separate path also makes me a
> bit nervous as it's much less widely used so more likely to get broken
> accidentially.

I will try to come up with something that addresses this.

> Otherwise pushing things out to the caller makes sense, it should have
> no real impact in the majority of cases where the thread is just getting
> used to idle the controller and the actual work is all happening in the
> calling context anyway and if the pump is being used it means it's
> spending more time actually pushing transfers out.
>
> For the case where we do have the message pump going one thing it'd be
> good to do is overlap more of the admin work around the messages with
> other transfers - ideally we'd be able to kick off the next transfer
> from within the completion of a DMA.  I need to have a dig around and
> figure out if I have any hardware that can actually support that, last
> time I looked at this my main system needed to kick everything up to the
> thread due to hardware requirements.

But to get all of this done I fear it will definitely require API changes
and thus a new kind of message pump.

Maybe the pump could be shared by multiple SPI (master) controllers.
This would help when there are, say, 4 devices, each connected to a
separate controller, all transferring short messages that get handled by
polling - as it stands that would mean 4 CPUs busy polling, which burns
a lot of CPU cycles. Pooling this polling in a single shared worker
would avoid that (see the sketch below).

I am also starting to wonder if there is a way to make the wakeup of
these threads fast/high-priority, so that the latency of spi_sync stays
minimal - essentially yielding the CPU to the "right" thread (so making
a yield cheap).
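Just to make the shared-pump idea a bit more concrete, here is a very
rough sketch of what I have in mind - spi_shared_pump_get() and the
"spi-shared-pump" name are made up for illustration, only the kthread_*
and sched_setscheduler() calls are existing API:

#include <linux/kthread.h>
#include <linux/mutex.h>
#include <linux/sched.h>
#include <uapi/linux/sched/types.h>

static DEFINE_MUTEX(shared_pump_lock);
static struct kthread_worker *shared_pump;

/* get (and lazily create) the one worker all controllers would share */
static struct kthread_worker *spi_shared_pump_get(void)
{
	struct sched_param param = { .sched_priority = MAX_RT_PRIO / 2 };

	mutex_lock(&shared_pump_lock);
	if (!shared_pump) {
		struct kthread_worker *w;

		w = kthread_create_worker(0, "spi-shared-pump");
		if (IS_ERR(w)) {
			mutex_unlock(&shared_pump_lock);
			return w;	/* caller checks IS_ERR() */
		}
		/* keep the wakeup latency for spi_sync callers low */
		sched_setscheduler(w->task, SCHED_FIFO, &param);
		shared_pump = w;
	}
	mutex_unlock(&shared_pump_lock);
	return shared_pump;
}

/*
 * A controller would then queue its (already existing) pump work on
 * the shared worker instead of on a private one, roughly:
 *
 *	kthread_queue_work(spi_shared_pump_get(), &ctlr->pump_messages);
 */

Running that single worker with SCHED_FIFO would at the same time be a
first answer to the wakeup-latency question above, with the usual
caveats that RT priority brings.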
But let us see how far we get before we can tackle this...

From a performance/throughput perspective I guess it may be relevant to
extend the spi_test framework to also gather performance/latency
statistics, so that we have a means to compare actual performance
numbers and avoid regressions.

Martin
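P.S.: to illustrate the kind of statistics gathering I mean, a rough
sketch only - spi_sync_timed(), struct spi_perf_stats and the log
format are made up and not existing spi-test code; only spi_sync() and
the ktime/div64 helpers are real API:

#include <linux/ktime.h>
#include <linux/math64.h>
#include <linux/printk.h>
#include <linux/spi/spi.h>

struct spi_perf_stats {		/* hypothetical accumulator */
	u64 messages;
	s64 total_us;
	s64 min_us;
	s64 max_us;
};

/* wrap spi_sync() so that every message updates the statistics */
static int spi_sync_timed(struct spi_device *spi, struct spi_message *msg,
			  struct spi_perf_stats *stats)
{
	ktime_t start = ktime_get();
	int ret = spi_sync(spi, msg);
	s64 us = ktime_us_delta(ktime_get(), start);

	stats->messages++;
	stats->total_us += us;
	if (stats->messages == 1 || us < stats->min_us)
		stats->min_us = us;
	if (us > stats->max_us)
		stats->max_us = us;
	return ret;
}

/* dump the numbers after a test run so regressions show up in the log */
static void spi_perf_stats_report(const struct spi_perf_stats *stats)
{
	pr_info("spi-test: %llu msgs, avg %lld us, min %lld us, max %lld us\n",
		stats->messages,
		stats->messages ? div64_s64(stats->total_us,
					    stats->messages) : 0,
		stats->min_us, stats->max_us);
}

The test cases would then simply call spi_sync_timed() instead of
spi_sync() and report the accumulated numbers at the end of each run.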