On Wed, Mar 08, 2023 at 07:39:03AM -0800, Doug Anderson wrote: > Hi, > > On Fri, Mar 3, 2023 at 4:53 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > > > IMO better would be to say something like when sync_state=strict that > > > you'll just leave resources in a high power state > > > > But this statement is not true either. Just because a device driver > > has a sync_state() doesn't mean the device was left in a powered on > > state by the bootloader. > > Though I guess it's theoretically possible that a device using > sync_state will leave resources in a _lower_ power state until > sync_state is reached, I'm skeptical if that actually happens. Can you > point to any examples? The sync state docs > "sysfs-devices-state_synced" actually document that the common case is > when the bootloader left a resource enabled and we won't disable the > resource until sync_state is reached. That's almost certainly a higher > power state. > > I would also point to one of the users of sync_state: the interconnect > framework. Take a look at commit b1d681d8d324 ("interconnect: Add sync > state support"). You can see that in icc_node_add() if we can't read > the bandwidth at bootup we end up at the max (INT_MAX). That's exactly > the case we actually hit for Qualcomm. It's not that we just avoid > touching the resources until sync state is reached--we actually max it > out. Another example is commit 3a39049f88e4 ("soc: qcom: rpmhpd: Use highest corner until sync_state"), which does the same for rpmhpds. > In general, something feels a bit awkward here in defining this as > "however the bootloader left it". That concept makes sense for things > where we need to manage a handoff from the bootloader for the kernel, > but it's not the answer for all things. The bootloader's job is to > boot the system and get out of the way, not to init all resources. It > only inits resources that it cares about. That means if the bootloader > displays a splash screen then it might init resources for the display. > if it doesn't display a splash screen it might not. The kernel needs > to handle either case. > > In general, the problems being solved with sync_state seem to require > resources to be left on and in high power until sync state is reached. > Today, you define that as "the state the bootloader left it in". > ...but if the bootloader didn't leave it in a high power state then > you'd need to change this definition. > > If you truly want to couch the verbiage, I guess I'd be OK with saying > "when sync_state=strict that you'll _LIKELY_ leave resources in a high > power state if sync_state is never reached" > > > > > While I don't object to this being a kernel command line flag, the > > > default should also be a Kconfig option. The kernel command line is > > > not a great place for general configuration. As we jam too much stuff > > > in the kernel command line it gets unwieldy quickly. IMO: > > > > > > * Kconfig: the right place for stuff for config options that a person > > > building the kernel might want to tweak. > > > > > > * Kernel command line: the right place for a user of a pre-built > > > kernel to tweak; also (sometimes) the right place for the bootloader > > > to pass info to the kernel; also a good place for debug options that a > > > kernel engineer might want to tweak w/out rebuilding the kernel. > > > > > > In this case it makes sense for the person building the kernel to > > > choose a default that makes sense for the hardware that their kernel > > > is targetting. It can also make sense for a user of a pre-built kernel > > > to tweak this if their hardware isn't working correctly. Thus it makes > > > sense for Kconfig to choose the default and the kernel command line to > > > override. > > > > I don't mind adding a Kconfig to select the default behavior, but > > maybe as a separate patch in the future so if there's any debate about > > that, you'll at least get this option. > > I don't mind it being a separate patch, but it should be part of the > initial series. +1 > > > Specifically, I think this warning message gets printed out after > > > we've given up waiting for devices to show up. At this point > > > -EPROBE_DEFER becomes an error that we won't retry. > > > > This is not true. We will always retry on an -EPROBE_DEFER, even after timeout. > > OK, so I think this is the main point of contention here, so let's get > to the bottom of it first and then we can address anything else. > > I guess I'm trying to figure out what "deferred_probe_timeout" is > supposed to be about. From reading > driver_deferred_probe_check_state(), I see that the idea is that once > the timeout expires then we'll start returning -ETIMEDOUT when we used > to return -EPROBE_DEFER. I guess I mispoke then. You're correct that > -EPROBE_DEFER will still be retried. That being said, things that used > to be retired (because they returned -EPROBE_DEFER) will now become > permanent/non-retired errors (because they return -ETIMEDOUT). > > My point is that if we ever actually hit that case (where we return > -ETIMEDOUT instead of -EPROBE_DEFER) we really enter a state where > it's not going to be great to load any more drivers. Once a driver > failed to probe (because it got back an -ETIMEDOUT instead of > -EPROBE_DEFER) then the user needs to manually unbind/rebind the > device to retry. That's not a good state. > > So the above is the crux of my argument that once > "deferred_probe_timeout" fires that the system really isn't in good > shape to load more drivers. > > So looking more carefully, I think I can understand where you're > coming from. Specifically I note that very few subsystems have "opted > in" to the deferred_probe_timeout on ToT. I can also see that recently > you made the effort to delete driver_deferred_probe_check_state(), > though those were reverted. That means that, as it stands, devices > will _probably_ not end up with the problem I describe above (unless > they depend on a subsystem that has opted-in). ...and, if your plans > come to fruition, then eventually we'll never hit it. > > Where does that leave us? I guess I will step back on my assertion > that when the timeout fires that drivers can't load anymore. Certainly > the state that ToT Linux is in is confusing. "deferred_probe_timeout" > is still documented (in kernel-parameters.txt) to cause us to "give > up" waiting for dependencies. ...and it still causes a few subsystems > to give up. ...but I guess it mostly works. > > > > > I would perhaps also make it sound a little scarier since, > > > > I definitely don't want to make it sound scarier and get everyone to > > enable the timeout by default without actually knowing if it has a > > power impact on their system. > > > > > IMO, this > > > is a problem that really shouldn't be "shipped" if this is an embedded > > > kernel. Maybe something like: > > > > This is how it's shipped on all Android devices in the past 2 years. > > So it's not a global problem like you make it to be. > > You're saying devices _shipped_ but booted up where devices never > reached sync_state? ...and that's not a power consumption problem??? > I'm not saying that the sync_state concept couldn't ship, I'm saying > that if this printout shows up in boot logs that it's highly likely > there's a problem that needs to be fixed and that's causing extra > power consumption. That's why I want the printout to sound scarier. +1