Re: [PATCH v1 0/3] introduce priority-based shutdown support

Ferry Toth <fntoth@xxxxxxxxx> · Sun, 26 Nov 2023 20:42:02 +0100

Ha ha,

Funny discussion. As a hardware engineer (with no experience in 
automotive, but actual experience in industrial applications and 
debugging issues arising from bad shutdowns) let me add my 5ct at the end.

Op 26-11-2023 om 11:14 schreef Mark Brown:
On Sat, Nov 25, 2023 at 07:58:12PM +0000, Greg Kroah-Hartman wrote:
On Sat, Nov 25, 2023 at 03:43:02PM +0000, Mark Brown wrote:
On Sat, Nov 25, 2023 at 02:35:41PM +0000, Greg Kroah-Hartman wrote:

That would be great, but I don't see that here, do you?  All I see is
the shutdown sequence changing because someone wants it to go "faster"
with the threat of hardware breaking if we don't meet that "faster"
number, yet no knowledge or guarantee that this number can ever be known
or happen.

The idea was to have somewhere to send notifications when the hardware
starts reporting things like power supplies starting to fail.  We do
have those from hardware, we just don't do anything terribly useful
with them yet.

Ok, but that's not what I recall this patchset doing, or did I missing
something?  All I saw was a "reorder the shutdown sequence" set of
changes.  Or at least that's all I remember at this point in time,
sorry, it's been a few days, but at least that lines up with what the
Subject line says above :)

That's not in the series, a bunch of it is merged in some form (eg, see
hw_protection_shutdown()) and more of it would need to be built on top
if this were merged.

Agreed, but I don't think this patch is going to actually work properly
over time as there is no time values involved :)

This seems to be more into the area of mitigation than firm solution, I
suspect users will be pleased if they can make a noticable dent in the
number of failures they're seeing.

Mitigation is good, but this patch series is just a hack by doing "throw
this device type at the front of the shutdown list because we have
hardware that crashes a lot" :)

Sounds like a mitigation to me.

It feels like if we're concerned about mitigating physical damage during
the process of power failure that's a very limited set of devices - the
storage case where we're in the middle of writing to flash or whatever
is the most obvious case.

Then why isn't userspace handling this?  This is a policy decision that
it needs to take to properly know what hardware needs to be shut down,
and what needs to happen in order to do that (i.e. flush, unmount,
etc.?)  And userspace today should be able to say, "power down this
device now!" for any device in the system based on the sysfs device
tree, or at the very least, force it to a specific power state.  So why
not handle this policy there?

Given the tight timelines it does seem reasonable to have some of this
in the kernel - the specific decisions about how to handle these events
can always be controlled from userspace (eg, with a sysfs file like we
do for autosuspend delay times which seem to be in a similar ballpark).

I'd prefer not to call the HW broken in this case. The life of hardware 
(unlike software) continues during and after power down. That means 
there may be requirements and specs for it to conform to during those 
transitions and states. Unlike broken hardware, which does not conform 
to its specs. Typically, a HDD that autoparks its heads to a safe 
position on its last rotation energy, that's not broken, that's 
carefully designed.

That said, I agree with Greg, if there is a hard requirement to shutdown 
safely to prevent damage, the solution is not to shutdown fast. The 
solution is to shutdown on time.

In fact, if the software needs more energy to shutdown safely, any 
hardware engineer will consider that a requirement. And ask the 
appropriate question: "how much energy do you need exactly?". There are 
various reasons why that can not be answered in general. The most funny 
answer I ever got (thanks Albert) being: "My software doesn't consume 
energy".
Now, we do need to keep in mind that storing J in a supercap, executing 
a CPU at GHz, storing GB data do not come free. So, after making sure 
things shutdown in time, it often pays off to shorten that deadline, and 
indeed make it faster.

Looking at the above discussion from the different angles:
1) The hardware mentioned does not need to shutdown (as said, it doesn't 
need to be unmounted). It needs to be placed into a safe state on time. 
And the only thing here that can know for the particular hardware what 
is a safe state, is the driver itself.
2) To get a signal (Low Power Warning) to the driver on time, the 
PREEMPT_RT kernel seems like a natural choice.
3) To me (but hey who am I) it makes sense to have a generic mechanism 
from drivers to transition to their safe state if they require that.
4) I wouldn't worry about drivers fighting for priority, these systems 
are normally "embedded" with fixed hardware. Otherwise there is no way 
to calculate shutdown energy required and do proper hardware design.