On Mon, May 13, 2019 at 09:21:40AM +0200, kernel@xxxxxxxxxxxxxxxx wrote:
> On 12.05.2019, at 10:54, Mark Brown <broonie@xxxxxxxxxx> wrote:
> > On Thu, May 09, 2019 at 09:47:08PM +0200, Martin Sperl wrote:

> > IIRC the mapping is deliberately done late in order to minimize the
> > amount of time we're consuming resources for the mapping; there were
> > some systems that had limited DMA channels.  However, I don't know
> > how big a concern that is in this day and age with even relatively
> > old systems.

> We may be able to do the mapping either early or late.  The place
> where it REALLY makes a difference is when we are running in the pump
> (because of async usage, or because of multiple threads writing to
> the same SPI bus via spi_sync).

Yeah, it's definitely not a common case issue (see the can_dma and
spi_sync sketches at the end of this mail).

> > For the case where we do have the message pump going one thing it'd
> > be good to do is overlap more of the admin work around the messages
> > with other transfers - ideally we'd be able to kick off the next
> > transfer from within the completion of a DMA.  I need to have a dig
> > around and figure out if I have any hardware that can actually
> > support that; last time I looked at this my main system needed to
> > kick everything up to the thread due to hardware requirements.

> But to get all this done I fear it will definitely require API
> changes and thus a new kind of pump.

Yes, it'd need new APIs - probably along the same lines as the prior
ones where we provide finer-grained ops.  The stuff with
finalize_current_message() was working towards that (there's a sketch
of that pattern at the end of this mail).

> Maybe the pump could be shared by multiple SPI (master) controllers.
> This would help when there are, say, 4 devices each connected to a
> separate controller, all transferring short messages that would get
> handled by polling - that would mean 4 CPUs just polling, which also
> consumes lots of CPU cycles.  If we could pool this polling

But then we can't have all the cores in our multi-core system polling,
so everything will be slower!

> I am starting to wonder if there is a means to make the wakeup of
> threads fast/high-priority so as to keep the latency of spi_sync
> minimal - essentially yielding the CPU to the “right” thread (so
> making a yield cheap).  But let us see how far we get before we can
> tackle this...

> From a performance/throughput perspective I guess it may be relevant
> to extend the spi_test framework to also gather performance/latency
> statistics so that we have a means to compare actual performance
> numbers and avoid regressions.

For that sort of instrumentation ftrace/perf is probably already going
to be more than enough; it just needs a bit of setup and there's a
learning curve with the tooling (example commands at the end of this
mail).
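Since a couple of the points above are easier to see in code, some
rough sketches follow; all of the foo_* names in them are made up for
illustration.

First the mapping: as things stand the core only DMA-maps a message
right before it gets executed, and only if the controller's can_dma
hook says it is worthwhile, which is where the "map late" behaviour
falls out.  A minimal sketch, with a hypothetical length threshold:

  #include <linux/spi/spi.h>

  /* Tell the core whether this transfer should be DMA-mapped at all;
   * the core calls this shortly before executing the transfer. */
  static bool foo_spi_can_dma(struct spi_controller *ctlr,
                              struct spi_device *spi,
                              struct spi_transfer *xfer)
  {
          /* hypothetical threshold - PIO is cheaper for tiny transfers */
          return xfer->len > 64;
  }

  /* in probe, before spi_register_controller(): */
  ctlr->can_dma = foo_spi_can_dma;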
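The spi_sync contention Martin describes comes from the completely
ordinary short-message pattern, something like this (illustrative
only; note the buffer has to be DMA-safe, i.e. not on the stack):

  #include <linux/spi/spi.h>

  /* One short full-duplex transfer, submitted synchronously; with the
   * queue busy this ends up going through the pump. */
  static int foo_spi_xfer(struct spi_device *spi, void *buf, size_t len)
  {
          struct spi_transfer t = {
                  .tx_buf = buf,
                  .rx_buf = buf,
                  .len    = len,
          };

          return spi_sync_transfer(spi, &t, 1);
  }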
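On kicking the next transfer off from the DMA completion: the current
contract already lets transfer_one_message() return before the message
has finished and report completion asynchronously via
spi_finalize_current_message(); the bit that would need new API is
avoiding the bounce back through the worker thread afterwards.  A
sketch of the asynchronous shape, where foo_prep_message() and
foo_dma_chan() stand in for real driver plumbing:

  #include <linux/dmaengine.h>
  #include <linux/spi/spi.h>

  /* DMA completion callback: finish the current message here; note
   * that today this still wakes the pump to pick up the next queued
   * message, which is part of what is being discussed above. */
  static void foo_spi_dma_done(void *data)
  {
          struct spi_controller *ctlr = data;

          /* ... deassert CS, ack controller status, etc ... */
          spi_finalize_current_message(ctlr);
  }

  static int foo_spi_transfer_one_message(struct spi_controller *ctlr,
                                          struct spi_message *msg)
  {
          struct dma_async_tx_descriptor *desc;

          /* hypothetical helper preparing one descriptor covering the
           * (already DMA-mapped) transfers in this message */
          desc = foo_prep_message(ctlr, msg);
          if (!desc)
                  return -EIO;

          desc->callback = foo_spi_dma_done;
          desc->callback_param = ctlr;

          dmaengine_submit(desc);
          dma_async_issue_pending(foo_dma_chan(ctlr));

          /* completion is reported from the callback, not here */
          return 0;
  }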
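And on the instrumentation: the SPI core already has tracepoints
covering the message and transfer lifecycle (spi_message_submit/
start/done and spi_transfer_start/stop), so something along these
lines should give latency numbers without touching spi_test at all:

  # ftrace, via tracefs
  cd /sys/kernel/debug/tracing
  echo 1 > events/spi/enable
  cat trace_pipe

  # or the same tracepoints through perf
  perf record -e 'spi:*' -a sleep 10
  perf script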