Re: [RFC PATCH] regulator: core: Allow for a base-load per client to come from dts

Doug Anderson <dianders@xxxxxxxxxxxx> · Fri, 22 Jul 2022 16:35:48 -0700

Hi,

On Fri, Jul 22, 2022 at 11:22 AM Mark Brown <broonie@xxxxxxxxxx> wrote:
>
> On Thu, Jul 21, 2022 at 06:26:44PM -0700, Douglas Anderson wrote:
>
> > Looking through the current users of regulator_set_load(), all but one
> > or two basically follow the pattern:
>
> > 1. Get all the regulators.
> > 2. Go through each regulator and call regulator_set_load() on it with
> >    some value that's sufficient to get the regulator to run in "HPM"
> >    mode.
> > 3. Enable / disable the regulator as needed.
>
> Sure...  I guess these devices are just hoping that their idle currents
> are near enough to zero to be safe.  Having refreshed my mind about how
> all this works that's basically the assumption that Linux is making
> here, that any idle currents will be low enough to be satisified by the
> lowest power state of any practical regulator and we don't need to keep
> track of them.
>
> > Specifically it should be noted that many devices don't really have a
> > need for the low power mode of regulators. Unfortunately the current
>
> What exactly do you mean by "a need for the low power mode of
> regulators"?

Let me try to come up with some examples. I'll assume a regulator that
can go up to 10mA in LPM (low power mode) and 100mA in HPM (high power
mode). Software needs to switch between LPM and HPM manually. I don't
have great numbers, but I'll assume that the LDO is 90% power
efficient in LPM and 40% power efficient in HPM.

Example 1: Trackpad that powers up on its own when it detects activity

Let's imagine a driver for a trackpad that talks over i2c. We power up
the trackpad's main supplies and the trackpad's firmware boots up. The
trackpad goes idle and starts drawing only a little power (1 mA?) When
someone gets near the trackpad (triggering the capacitive detectors)
then the firmware on the trackpad will kick into action and start
drawing a bunch of power (20mA?) to analyze the signals. If the
trackpad decides it got a real touch then it will send an interrupt to
the SoC.

This trackpad _cannot_ use software managed LPM/HPM because it doesn't
tell the main apps processor (AP) before it starts drawing more power
and the AP is in charge of selecting HPM/LPM. If we want to use this
trackpad with this regulator we just need to force it to HPM.

In my above terminology, this driver doesn't "need" the low power mode
of the regulators since it can't take advantage of it.

Example 2: Trackpad that warns the AP before it draws power

Let's imagine changing the firmware on the trackpad in the first
example. Now if the trackpad detects activity and wants to power up it
sends an interrupt to the AP. When the AP says "OK" it starts its high
power work. The AP can keep the trackpad in LPM most of the time and
switch to HPM only when there's activity. This saves power, but was it
a good design? It really depends. If this happens very very
infrequently then it might save power. If my math is right:

Idle trackpad in HPM (1mA @ 40% efficiency): 1 / .4 = 2.5mA
Idle trackpad in LPM (1mA @ 90% efficiency): 1 / .9 1.1mA

So we save 1.4mA of idle power but at the cost of complexity and
response time for the trackpad. We also may need to spin up the CPU
more often which could blow through our power savings.

The answer here is much less clear cut. Maybe this is totally not
worth it, but for a device that's very idle, that 1.4 mA might be
worth it. If the idle power was 8mA (8 / .4 = 20mA) it could really be
worth it.

Example 3: A SPI controller

Let's imagine a SPI controller that takes 5 mA when actively
transfering data and .1mA when idle (register retention). We have
several controllers powered by the same rail that may or may not be
transferring at the same time, so we might end up needing HPM. The
controller never powers up on its own--only when we tell it to
transfer.

We could design our driver in two ways:

3a) Turn on the regulators at driver probe time and init our
registers. Turn them off at system suspend and on at system resume
(and restore our registers). When actively doing a transfer we request
our 5mA load and when we're doing with our transfer we request our
.1mA load.

3b) Right before an active transfer, turn on our regulator and program
all registers (there aren't that many since SPI is a fairly simple
protocol). After the transfer is done, turn the regulator off.

Both 3a) and 3b) work. If reprogramming the registers was a chore or
slow we might choose 3a). If we choose 3b), the driver doesn't really
need to be HPM/LPM aware. Something needs to specify its load once but
it doesn't need to actively manage the load, which I guess is what I
meant by a driver not "needing" the low power mode of the regulator. I
meant that the driver doesn't need the API to actively manage its
load.

> > state of the world means that all drivers are cluttered with tables of
> > loads and extra cruft at init time to tell the regulator framework
> > about said loads.
>
> We're only talking about a few lines of code here, and we could make a
> bulk helper to set loads if this is a serious issue.  You could even
> just add some more values in the bulk struct and have it be pure data
> for the drivers, regulator_bulk_get() could see a load and set it.  I
> think this may actually be the bulk of what's concerning you here?

Extending the bulk API would definitely be a good thing to do if we
reject my proposal. It would be easy to eliminate a lot of code across
a lot of drivers this way.

I guess the thing I don't love, though, is that to get to Dmitry's
dream of all regulators having their load specified the we have to add
regulator_set_load() calls to all sorts of drivers across the system.
Right now, at least in mainline, the regulator_set_load() calls are
almost entirely relegated to Qualcomm-related drivers and for
peripherals which are on the SoC itself.

Let's go back to the trackpad example, maybe looking specifically at
"hid-over-i2c". The driver here is fairly hardware agnostic and the
current used may not be very easy for the i2c-hid driver to specify in
code. I suppose one answer here is: use board constraints to force
this regulator to always use HPM (high power mode). That's fine, but
then if we have a way to turn this device "off" and it's on a shared
rail? Maybe then we'd want to be able to put the regulator in LPM
mode?

I guess this is really the place where being able to specify loads in
the device tree really seems like the right fit. We've got generic
(non-Qualcomm) components that perhaps are probable and handled by a
generic driver, but they're powered by these regulators where software
needs to change between HPM and LPM. It feels clean to specify the
load of the i2c-hid trackpad in a given board's device tree.

> > There are lots of ways we could combine this "base load" with a load
> > that the driver requests. We could:
> > - Add them.
> > - Take the maximum (assume that the "base" is the min).
> > - Only use the "base" load if the driver didn't request one.
>
> This feels a bit mangled, I think largely because you're using the term
> "base load" to refer to something which I *think* you want to be the
> peak load the device might cause when it's actively doing stuff which is
> just very confusing.  If this were being specified per device I'd expect
> that to be the idle rather than active current with a name like that.
> That's just naming, but it does suggest that if we were to put this in
> DT it should be more like the pinctrl equivalents and done as an array
> of currents with matching array of names.

Sorry for the bad naming. Originally I was thinking that the dts would
specify the minimum current but as I thought about it more it felt
like it made more sense to specify the current that should be used for
a dumb driver (one that doesn't know anything about loads).

Doing an array would make sense. Like:

vdda-phy-load-names = "default", "off", "..."
vdda-phy-loads = <21800>, <500>, <...>;

Where "default" would be the load by default when enabled, "off" when
disabled off (if it's a shared rail it could conceivably draw power
even after regulator_disable()). Others states could be selected
manually.

If we did something like that, we'd have to figure out how those
meshed with existing driver calls. Happy for any thoughts.

> > I have chosen the third option here. If the driver knows better then
> > it can override. Note that the driver can't override to "0". To do
> > that it would just disable the regulator.
>
> I don't like the idea of having platform constraints which we ignore,
> this goes against the general safety principal of the regulator API
> which is that we won't touch the device state unless given explicit
> permission to do so by the platform description.  The general idea is
> to ensure that we can't do anything that might damage the platform
> without approval from the system integrator.
>
> > NOTE: if we want to keep old binary dtb files working with new kernels
> > then we'd still have to keep existing crufty regulator_set_load() in
> > drivers, but hopefully we can stop adding new instances of this, and
> > perhaps eventually people will agree to deprecate old binary dtb files
> > and we can get rid of all the extra code.
>
> To be perfectly honest I'm not sure I see any great advantage here.
> It's moving the performance numbers from the driver into DT which makes
> them ABI which makes them harder to update, and the peak load is going
> to be vulnerable to changes due to driver changes - if we do some
> performance work and manage to start driving a device harder the numbers
> specified in the DT may become wrong.  In general it seems safer to not
> put things into ABI if we don't have to, and a substantial proportion of
> things that might use regulators are off-SoC devices which don't
> generally want or need DT fragments to describe them so we end up having
> to duplicate things in multiple DTs which isn't ideal.

Yeah, I know. Putting things in dts certainly has its downsides. The
argument you make can be taken to the extreme, of course. We can take
everything out of dts files and just specify a top-level board
compatible and everything else could be a table in source code (or,
here's an idea with extra flexibility, a board.c file). Now you never
worry about breaking ABI at all. ;-)

All snarkiness aside, though, I'm just trying to point out that
there's no definitive line to cross. No black and white. Anything we
put in dts from "assigned-clock-rates" to voltage ranges to whatever
could need to be updated as we discover more data. It is why I've also
said that for modern SoCs, I'll believe it's safe to ship device trees
separate from the kernel when I see someone add the "device tree fixup
stage" to the kernel boot flow.

> What's big push to put this in DT rather than just make the code pattern
> easier to use?  I'm not totally against it but I'm having a hard time
> getting enthusiastic about it.

It's nothing make or break and mostly it's just been kicking around in
the back of my head for the last few years. Mostly I gave a shot at
implementing something in response to your comment [1] where you said:
"You could add a way to specify constant base loads in DT on either a
per regulator or per consumer basis." ;-)

[1] https://lore.kernel.org/r/Ytk2dxEC2n/ffNpD@xxxxxxxxxxxxx

> I think the underlying thing here (and the reason you're seeing this a
> lot with Qualcomm device specifically) is that Qualcomm have a firmware
> interface to specify loads and wanted to implement selection of a low
> power mode when things are idle in the way that Linux does with the
> get_optimum_mode() interface.  Either they actually have some systems
> where they were able to select different modes when things are running
> or they don't have the thing we have where they discount loads from
> consumers that have removed their enable vote, either way for those
> platforms what's going on currently does seem like a sensible way of
> setting things up.

I guess I'm a tad less charitable than you are. From what I've seen of
Qualcomm, there seems to be an in-house checklist that they have to go
through before landing code. Drivers have to pass the checklist. I
think it contains things like:

* All memory reads and writes must be the "relaxed" variant unless you
can strongly justify the non-relaxed one. It doesn't matter if it's in
performance critical code. The relaxed version is better, even if it
is more likely to cause bugs.

* Thou shalt always set a pin's drive strength to exactly 2 if it
doesn't have a reason to be otherwise. It doesn't matter if the pin is
an input and the drive strength has no effect. Someone said that 2 is
safest so always set it to 2.

* All regulators should be configured to be able to run in high power
mode and low power mode. You should always configure a load on them.
It doesn't matter if you never use low power mode in practice.

> Other platforms don't bother, I think that's likely to be some
> combination of not caring about anything finer grained than system
> suspend (which especially for Android is totally fair) and modern
> regulators being much more able to to dynamically adapt to load than
> those the code was written for.

Yeah. I think this comes up more in practice with Qualcomm because
they do their own PMICs and every one of their LDOs has this setting
for LPM / HPM. Since the setting is there I guess they decided they
should write software to use it. I agree that it can save some power
when used well, but I personally don't think the complexity is
justified except in a small number of cases.

I would hazard a guess that perhaps Qualcomm's LDOs all have this
feature because it's probably super critical for LTE modems and they
might as well add the feature for all the LDOs if they're adding it
for some of them.

-Doug