On 9/5/23 04:43, Tomas Melin wrote:
Hi,
On 04/09/2023 16:12, Jonathan Cameron wrote:
On Mon, 4 Sep 2023 14:23:29 +0300
Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
On Mon, Sep 04, 2023 at 01:15:22PM +0300, Tomas Melin wrote:
Support deferred probe for cases where communication on
i2c bus fails. These failures could happen for a variety of
reasons including bus arbitration error or power failure.
+out:
+ if ((ret == -EAGAIN) || (ret == -ENXIO))
+ return -EPROBE_DEFER;
+ return ret;
Oh my... This looks so-o hackish.
Agreed. This is a non-starter.
If anything, it has to be fixed on the level of regmap I2C APIs or so.
Maybe something like regmap_i2c_try_write()/try_read() new APIs that
will provide the above. Otherwise you want to fix _every single driver_
in the Linux kernel.
Any probe ordering dependencies should be described by the
firmware and the driver should 'get' the relevant resource.
If there is anything not describable today then that is what we need
to fix, not paper over the holes.
So can we have specifics of what is happening here?
If it's arbitration with some other entity then fix the arbitration
locking / hand over. If it's power, then make sure the relevant
regulator is acquired and turned on, with the right delays etc.
Yes, right. In this use case, the ads1015 is connected to a channel of
an i2c multiplexer. When the mux is probed, it also enumerates all the
multiplexed buses and probes devices connected to them.
For some reason, it behaves so that the ads1015 is not detected on the
first attempt. Since it's a mux connected to the main i2c line, perhaps
there really is some bus arbitration issue, or something else.
Anyway, when the probe for the ads1015 is deferred and attempted again
later, it probes fine.
So I may have taken the wrong angle on this issue, but
it does solve the issue at hand. Obviously, there could be some issue
with the i2c mux driver, or at the hardware level too.
The point is, if communication on the i2c bus hits some temporary
error like EAGAIN, why would it not be reasonable to try again at a
later time instead of giving up completely?
The way probe deferral works, or is supposed to work, is that if a
driver detects that it is missing a resource to initialize the device it
can return EPROBE_DEFER to try again later. Once a new resource becomes
available, the driver core calls probe again. In your case there is no resource
dependency, but just a random failure. So there is no guarantee that
probe will actually be called again since there might not be any new
resources that become available.
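To make the point above concrete, here is a minimal userspace toy model of that behaviour. It is not kernel code: the toy_* names and the struct are made up for illustration, and the model only captures the one property discussed here, namely that a deferred device is retried when some other driver binds, not on a timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517 /* same value as the kernel's include/linux/errno.h */

/* Hypothetical stand-in for a device; not a kernel structure. */
struct toy_device {
	int failures_left;  /* probes that will still hit the "random" I/O error */
	int probe_calls;    /* how many times probe has run */
	bool bound;
	bool deferred;
};

/* Stand-in for a probe that maps a transient I/O error to -EPROBE_DEFER. */
int toy_probe(struct toy_device *dev)
{
	dev->probe_calls++;
	if (dev->failures_left > 0) {
		dev->failures_left--;
		return -EPROBE_DEFER;
	}
	return 0;
}

/* One bind attempt. A deferred device is only parked, nothing is scheduled. */
bool toy_try_bind(struct toy_device *dev)
{
	int ret = toy_probe(dev);

	dev->bound = (ret == 0);
	dev->deferred = (ret == -EPROBE_DEFER);
	return dev->bound;
}

/* Models "a new resource became available": retry every parked device once. */
void toy_retry_deferred(struct toy_device *devs[], size_t n)
{
	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (devs[i]->deferred)
			toy_try_bind(devs[i]);
}
```

In this model a device whose first probe fails stays unbound until toy_retry_deferred() is called. If nothing else on the system binds after the failure, that call never happens.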
The solution you've implemented might work on your specific platform,
but it does not work by design; it only works by chance. Returning
EPROBE_DEFER for things like IO errors is not the right approach. If you
need a quick hack you can for example write a small userspace script
that will trigger re-probe of the device at system startup.
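As a rough sketch of such a workaround, written in C rather than as a shell script: the helper below writes a device name to a bus's drivers_probe file in sysfs, which asks the driver core to probe that device again. The function name request_reprobe() and the device name "1-0048" used below are assumptions for illustration; the real name depends on which bus number the mux channel gets and on the chip's address.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write dev_name to the given drivers_probe file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int request_reprobe(const char *probe_path, const char *dev_name)
{
	FILE *f;
	int ok;

	assert(probe_path != NULL && dev_name != NULL);

	f = fopen(probe_path, "w");
	if (!f)
		return -1;

	ok = fputs(dev_name, f) >= 0;
	/* The buffered write reaches the file on close, so check that too. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

At startup this could be called as request_reprobe("/sys/bus/i2c/drivers_probe", "1-0048"), run as root, once the mux and its child buses exist. It remains a workaround and does not explain why the first probe fails.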