Hi Doug, On Fri, Sep 22, 2023 at 05:11:10PM -0700, Doug Anderson wrote: > Hi, > > On Fri, Sep 22, 2023 at 12:08 PM Rob Herring <robh+dt@xxxxxxxxxx> wrote: > > > > > > This seems like overkill to me. Do we really need groups and a mutex > > > > for each group? Worst case is what? 2-3 groups of 2-3 devices? > > > > Instead, what about extending "status" with another value > > > > ("fail-needs-probe"? (fail-xxx is a documented value)). Currently, the > > > > kernel would just ignore nodes with that status. Then we can process > > > > those nodes separately 1-by-1. > > > > > > My worry here is that this has the potential to impact boot speed in a > > > non-trivial way. While trackpads and touchscreens _are_ probable, > > > their probe routines are often quite slow. This is even mentioned in > > > Dmitry's initial patches adding async probe to the kernel. See commit > > > 765230b5f084 ("driver-core: add asynchronous probing support for > > > drivers") where he specifically brings up input devices as examples. Ideally, all but one driver in a group should bail out of probe quickly if the device is not populated. If not, I would consider that to be a bug or at least room for improvement in that driver. The reason input devices can take a while to probe is because they may be loading FW over I2C or performing some sort of calibration procedure; only one driver in the group should get that far. > > > > Perhaps then this should be solved in userspace where it can learn > > which device is actually present and save that information for > > subsequent boots. > > Yeah, the thought occurred to me as well. I think there are a few > problems, though: > > a) Userspace can't itself probe these devices effectively. While > userspace could turn on GPIOs manually and query the i2c bus manually, > it can't (I believe) turn on regulators nor can it turn on clocks, if > they are needed. About the best userspace could do would be to blindly > try binding an existing kernel driver, and in that case why did we > need userspace involved anyway? > > b) While deferring to userspace can work for solutions like ChromeOS > or Android where it's easy to ensure the userspace bits are there, > it's less appealing as a general solution. I think in Johan's case > he's taking a laptop that initially ran Windows and then is trying to > run a generic Linux distro on it. For anyone in a similar situation, > they'd either need to pick a Linux distro that has the magic userspace > bits that are needed or they need to know that, on their laptop, they > need to manually install some software. While requiring special > userspace might make sense if you've got a special peripheral, like an > LTE modem, it makes less sense to need special userspace just to get > the right devices bound... I recommend against spilling the solution into user space; it's simply not practical for many downstream use-cases where platform engineers can tightly control their own bootloader and kernel, but have limited maintainership of user space which is likely to be shared by many other products. > > > > > It wouldn't be absurd to have a system that has multiple sources for > > > both the trackpad and the touchscreen. If we have to probe each of > > > these one at a time then it could be slow. It would be quicker to be > > > able to probe the trackpads (one at a time) at the same time we're > > > probing the touchscreens (one at a time). Using the "fail-needs-probe" > > > doesn't provide information needed to know which devices conflict with > > > each other. > > > > I would guess most of the time that's pretty evident. They are going > > to be on the same bus/link. If unrelated devices are on the same bus, > > then that's going to get serialized anyways (if bus accesses are what > > make things slow). Agreed with Rob here; in the case of a touchscreen, we're almost always dealing with a module that connects to the main board by way of a flex connector. Rarely do the bus and GPIO assignments change across alternates. > > > > We could add information on the class of device. touchscreen and > > touchpad aliases or something. > > Ah, I see. So something like "fail-needs-probe-<class>". The > touchscreens could have "fail-needs-probe-touchscreen" and the > trackpads could have "fail-needs-probe-trackpad" ? That could work. In > theory that could fall back to the same solution of grabbing a mutex > based on the group ID... > > Also: if having the mutex in the "struct device" is seen as a bad > idea, it would also be easy to remove. __driver_probe_device() could > just make a call like "of_device_probe_start()" at the beginning that > locks the mutex and then "of_device_probe_end()" that unlocks it. Both > of those calls could easily lookup the mutex in a list, which would > get rid of the need to store it in the "struct device". > > > > > That would lead me to suggest this: > > > > > > &i2c_bus { > > > trackpad-prober { > > > compatible = "mt8173-elm-hana-trackpad-prober"; > > > > > > tp1: trackpad@10 { > > > compatible = "hid-over-i2c"; > > > reg = <0x10>; > > > ... > > > post-power-on-delay-ms = <200>; > > > }; > > > tp2: trackpad@20 { > > > compatible = "hid-over-i2c"; > > > reg = <0x20>; > > > ... > > > post-power-on-delay-ms = <200>; > > > }; > > > }; > > > }; > > > > > > ...but I suspect that would be insta-NAKed because it's creating a > > > completely virtual device ("mt8173-elm-hana-trackpad-prober") in the > > > device tree. I don't know if there's something that's functionally > > > similar that would be OK? This solution seems a bit confusing to me, and would require more edits to the dts each time a second source is added. It also means one would have to write a small platform driver for each group of devices, correct? I like the idea of a new "status" string; just my $.02. > > > > Why do you need the intermediate node other than a convenient way to > > instantiate a driver? You just need a flag in each node which needs > > this special handling. Again, "status" could work well here since it > > keeps the normal probe from happening. But I'm not saying you can't > > have some board specific code. Sometimes you just need code to deal > > with this stuff. Don't try to parameterize everything to DT > > properties. > > I think I'd have an easier time understanding if I knew where you > envisioned the board-specific code living. Do you have an example of > board specific code running at boot time in the kernel on DT systems? > > > > Note that the above only works with "generic" compatibles with > > "generic" power sequencing properties (I won't repeat my dislike > > again). > > I don't think so? I was imagining that we'd have some board specific > code that ran that knew all the possible combinations of devices, > could probe them, and then could instantiate the correct driver. > > Imagine that instead of the hated "hid-over-i2c" compatible we were > using two other devices. Imagine that a given board could have a > "elan,ekth6915" and a "goodix,gt7375p". Both of these devices have > specific timing requirements on how to sequence their supplies and > reset GPIOs. For Elan we power on the supplies, wait at least 1 ms, > deassert reset, wait at least 300 ms, and then can talk i2c. For > Goodix we power on the supply, wait at least 10 ms, deassert reset, > wait at least 180 ms, and then can talk i2c. If we had a > board-specific probing driver then it would power on the supplies, > wait at least 10 ms (the max of the two), deassert reset, wait at > least 300 ms (the max of the two), and then see which device talked. > Then it would instantiate whichever of the two drivers. This could be > done for any two devices that EEs have determined have "compatible" > probing sequences. > > Ideally in the above situation we'd be able to avoid turning the > device off and on again between the board-specific probe code and the > normal driver. That optimization might need special code per-driver > but it feels doable by passing some sort of hint to the child driver > when it's instantiated. > > > > If only the driver knows how to handle the device, then you > > still just have to have the driver probe. If you *only* wanted to > > solve the above case, I'd just make "hid-over-i2c" take a 2nd (and > > 3rd) I2C address in reg and have those as fallbacks. > > Yeah, it did occur to me that having "hid-over-i2c" take more than one > register (and I guess more than one "hid-descr-addr") would work in my > earlier example and this might actually be a good solution for Johan. > I'm hoping for a better generic solution, though. > > > > You could always make the driver probe smarter where if your supply > > was already powered on, then don't delay. Then something else could > > ensure that the supply is enabled. I'm not sure if regulators have the > > same issue as clocks where the clock might be on from the bootloader, > > then a failed probe which gets then puts the clock turns it off. > > I'm not sure it's that simple. Even if the supply didn't turn off by > itself in some cases, we wouldn't know how long the supply was on. > > -Doug Thanks for charting this path; I'm really excited to see a solution to this common problem. Kind regards, Jeff LaBundy