On Mon, May 13, 2019 at 09:21:40AM +0200, kernel@xxxxxxxxxxxxxxxx wrote:
> On 12.05.2019, at 10:54, Mark Brown <broonie@xxxxxxxxxx> wrote:
> > On Thu, May 09, 2019 at 09:47:08PM +0200, Martin Sperl wrote:

> > IIRC the mapping is deliberately done late in order to minimize the
> > amount of time we're consuming resources for the mapping; there were
> > some systems that had limited DMA channels.  However, I don't know
> > how big a concern that is in this day and age with even relatively
> > old systems.

> We may be able to do the mapping either early or late.  The place
> where it REALLY makes a difference is when we are running in the pump
> (because of async usage, or because of multiple threads writing to
> the same SPI bus via spi_sync).

Yeah, it's definitely not a common case issue (see the can_dma and
spi_sync sketches at the end of this mail).

> > For the case where we do have the message pump going one thing it'd
> > be good to do is overlap more of the admin work around the messages
> > with other transfers - ideally we'd be able to kick off the next
> > transfer from within the completion of a DMA.  I need to have a dig
> > around and figure out if I have any hardware that can actually
> > support that; last time I looked at this my main system needed to
> > kick everything up to the thread due to hardware requirements.

> But to get all this done I fear it will definitely require API
> changes and thus a new kind of pump.

Yes, it'd need new APIs - probably along the same lines as the prior
ones where we provide finer-grained ops.  The stuff with
finalize_current_message() was working towards that (there's a sketch
of that pattern at the end of this mail).

> Maybe the pump could be shared by multiple SPI (master) controllers.
> This would help when there are, say, 4 devices each connected to a
> separate controller, all transferring short messages that would get
> handled by polling - that would mean 4 CPUs just polling, which also
> consumes lots of CPU cycles.  If we could pool this polling

But then we can't have all the cores in our multi-core system polling,
so everything will be slower!

> I am starting to wonder if there is a means to make the wakeup of
> threads fast/high-priority so as to keep the latency of spi_sync
> minimal - essentially yielding the CPU to the “right” thread (so
> making a yield cheap).  But let us see how far we get before we can
> tackle this...

> From a performance/throughput perspective I guess it may be relevant
> to extend the spi_test framework to also gather performance/latency
> statistics so that we have a means to compare actual performance
> numbers and avoid regressions.

For that sort of instrumentation ftrace/perf is probably already going
to be more than enough; it just needs a bit of setup and there's a
learning curve with the tooling (example commands at the end of this
mail).
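Since a couple of the points above are easier to see in code, some
rough sketches follow; all of the foo_* names in them are made up for
illustration.

First the mapping: as things stand the core only DMA-maps a message
right before it gets executed, and only if the controller's can_dma
hook says it is worthwhile, which is where the "map late" behaviour
falls out.  A minimal sketch, with a hypothetical length threshold:

  #include <linux/spi/spi.h>

  /* Tell the core whether this transfer should be DMA-mapped at all;
   * the core calls this shortly before executing the transfer. */
  static bool foo_spi_can_dma(struct spi_controller *ctlr,
                              struct spi_device *spi,
                              struct spi_transfer *xfer)
  {
          /* hypothetical threshold - PIO is cheaper for tiny transfers */
          return xfer->len > 64;
  }

  /* in probe, before spi_register_controller(): */
  ctlr->can_dma = foo_spi_can_dma;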
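The spi_sync contention Martin describes comes from the completely
ordinary short-message pattern, something like this (illustrative
only; note the buffer has to be DMA-safe, i.e. not on the stack):

  #include <linux/spi/spi.h>

  /* One short full-duplex transfer, submitted synchronously; with the
   * queue busy this ends up going through the pump. */
  static int foo_spi_xfer(struct spi_device *spi, void *buf, size_t len)
  {
          struct spi_transfer t = {
                  .tx_buf = buf,
                  .rx_buf = buf,
                  .len    = len,
          };

          return spi_sync_transfer(spi, &t, 1);
  }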
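On kicking the next transfer off from the DMA completion: the current
contract already lets transfer_one_message() return before the message
has finished and report completion asynchronously via
spi_finalize_current_message(); the bit that would need new API is
avoiding the bounce back through the worker thread afterwards.  A
sketch of the asynchronous shape, where foo_prep_message() and
foo_dma_chan() stand in for real driver plumbing:

  #include <linux/dmaengine.h>
  #include <linux/spi/spi.h>

  /* DMA completion callback: finish the current message here; note
   * that today this still wakes the pump to pick up the next queued
   * message, which is part of what is being discussed above. */
  static void foo_spi_dma_done(void *data)
  {
          struct spi_controller *ctlr = data;

          /* ... deassert CS, ack controller status, etc ... */
          spi_finalize_current_message(ctlr);
  }

  static int foo_spi_transfer_one_message(struct spi_controller *ctlr,
                                          struct spi_message *msg)
  {
          struct dma_async_tx_descriptor *desc;

          /* hypothetical helper preparing one descriptor covering the
           * (already DMA-mapped) transfers in this message */
          desc = foo_prep_message(ctlr, msg);
          if (!desc)
                  return -EIO;

          desc->callback = foo_spi_dma_done;
          desc->callback_param = ctlr;

          dmaengine_submit(desc);
          dma_async_issue_pending(foo_dma_chan(ctlr));

          /* completion is reported from the callback, not here */
          return 0;
  }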
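And on the instrumentation: the SPI core already has tracepoints
covering the message and transfer lifecycle (spi_message_submit/
start/done and spi_transfer_start/stop), so something along these
lines should give latency numbers without touching spi_test at all:

  # ftrace, via tracefs
  cd /sys/kernel/debug/tracing
  echo 1 > events/spi/enable
  cat trace_pipe

  # or the same tracepoints through perf
  perf record -e 'spi:*' -a sleep 10
  perf script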