On 11/02/22 6:56 am, Gabriel Krisman Bertazi wrote:
Bartosz Golaszewski <brgl@xxxxxxxx> writes:
My email address changed in September, that's why I didn't see the
email you sent in November to my old one.
Hi Bart,
thanks for the prompt reply and sorry for the wrong email address.
gpiod_to_irq() can be used in contexts other than driver probing, and I'm
worried existing users would not know how to handle it. Also: how come
you can get the GPIO descriptor from the provider but its interrupts
are not yet set up?
I'm definitely missing some context here, as it's been quite a while.
Shreeya, feel free to pitch in. :)
Existing users will probably receive -ENXIO if to_irq is not set and
was never intended to be set.
We are trying to solve a race that happens frequently when I2C is
built-in and pinctrl-amd is built as a module.
There is no dependency between I2C and pinctrl-amd, so while pinctrl-amd
is still setting the gc irq members through gpiochip_add_irqchip(), I2C
can call gpiod_to_irq(), which returns -ENXIO because gc->to_irq is
still NULL.
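To make the -ENXIO path concrete, here is a minimal user-space sketch of the lookup. It is only a model: struct model_chip, model_to_irq() and model_gpiod_to_irq() are hypothetical stand-ins, not the gpiolib code, and the irq_base arithmetic is an assumption made purely for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the chip's irq members. */
struct model_chip {
	int (*to_irq)(struct model_chip *gc, unsigned int offset);
	int irq_base;	/* assumed linear mapping, for illustration only */
};

/* Example callback a provider would install once it is ready. */
static int model_to_irq(struct model_chip *gc, unsigned int offset)
{
	return gc->irq_base + (int)offset;
}

/*
 * Models the consumer side: if the provider has not installed
 * to_irq yet, the caller gets -ENXIO, which is indistinguishable
 * from "this chip never provides interrupts".
 */
static int model_gpiod_to_irq(struct model_chip *gc, unsigned int offset)
{
	int irq;

	if (!gc->to_irq)
		return -ENXIO;

	irq = gc->to_irq(gc, offset);
	return irq ? irq : -ENXIO;	/* treat 0 as "no IRQ" */
}
```

Calling model_gpiod_to_irq() on a chip whose to_irq is still NULL returns -ENXIO; the same call succeeds once the callback has been installed, which is the ordering dependency described above.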
There have also been cases where gc->to_irq is set successfully but
other members are yet to be initialized by gpiochip_add_irqchip(), such
as the domain variable, which is used in .to_irq(); this ultimately
leads to a NULL pointer dereference, as Gabriel mentioned. I am working
on a fix that would use a mutex so that the gc irq members cannot be
accessed until they have all been completely initialized.
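As a rough illustration of the mutex idea, the following user-space sketch guards the irq members with a lock and an "initialized" flag. It is not the patch itself: every name here (guarded_chip, guarded_add_irqchip(), guarded_gpiod_to_irq()) is hypothetical, pthread mutexes stand in for kernel locking, and returning -ENXIO for the "not ready" case is just a placeholder choice.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the irq domain used by .to_irq(). */
struct guarded_domain {
	int base;
};

/* Hypothetical stand-in for the chip's irq members plus a guard. */
struct guarded_chip {
	pthread_mutex_t lock;
	bool initialized;	/* set only after all members are valid */
	struct guarded_domain *domain;
	int (*to_irq)(struct guarded_chip *gc, unsigned int offset);
};

/* Dereferences gc->domain, so it must not run before setup is done. */
static int guarded_to_irq(struct guarded_chip *gc, unsigned int offset)
{
	return gc->domain->base + (int)offset;
}

/* Provider side: publish all members atomically w.r.t. the lock. */
static void guarded_add_irqchip(struct guarded_chip *gc,
				struct guarded_domain *domain)
{
	pthread_mutex_lock(&gc->lock);
	gc->to_irq = guarded_to_irq;
	gc->domain = domain;
	gc->initialized = true;
	pthread_mutex_unlock(&gc->lock);
}

/* Consumer side: never sees a half-initialized set of members. */
static int guarded_gpiod_to_irq(struct guarded_chip *gc, unsigned int offset)
{
	int ret = -ENXIO;	/* placeholder "not ready" error */

	pthread_mutex_lock(&gc->lock);
	if (gc->initialized)
		ret = gc->to_irq(gc, offset);
	pthread_mutex_unlock(&gc->lock);

	return ret;
}
```

With this shape a consumer either gets an error or calls .to_irq() with the domain already in place; it can no longer observe to_irq set while domain is still NULL.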
I2C calls gpiod_to_irq() through the following call path:
kernel: Call Trace:
kernel: gpiod_to_irq.cold+0x49/0x8f
kernel: acpi_dev_gpio_irq_get_by+0x113/0x1f0
kernel: i2c_acpi_get_irq+0xc0/0xd0
kernel: i2c_device_probe+0x28a/0x2a0
kernel: really_probe+0xf2/0x460
kernel: driver_probe_device+0xe8/0x160
and pinctrl-amd makes gc visible through gpiochip_add_data_with_key().
Thanks,
Shreeya Patel
This is one of the races we saw in gpiochip_add_irqchip, depending on
the probe order. The gc is already visible while partially initialized,
if pinctrl-amd hasn't finished probing yet. Another device being probed
can hit an -ENXIO here if to_irq is not yet initialized, or enter
.to_irq() and oops. Shreeya's patch works around the first issue, but is
not a solution for the second.
There is another patch that has been flying around to address the Oops.
https://lkml.org/lkml/2021/11/8/900
She's been working on a proper solution for that one, which might
actually address this too and replace the current patch. Maybe you
could help us get to a proper solution there? I'm quite unfamiliar with
this code myself :)