On Wed, Oct 30, 2019 at 10:31 AM Hans de Goede <hdegoede@xxxxxxxxxx> wrote: > The problem here is that gpiochip_add_data_with_key() calls gpiochip_irqchip_init_hw() > before it calls gpiochip_irqchip_init_valid_mask(), so after commit 88583e340a0e > when byt_gpio_irq_init_hw runs gc->irq.valid_mask is NULL and we crash with a NULL > pointer exception (or so I believe, the kernel never gets far enough to get > any info out of it without extra work). > > Note that this ("[PATCH v1] pinctrl: baytrail: Move IRQ valid mask initialization to a dedicated callback") > patch fixes this since it moves the gc->irq.valid_mask accesses to > byt_init_irq_valid_mask. OK so we have a halfway fix there. > But this change itself triggers another variant of this ordering issue, > it causes these 2 new errors to get logged: > > byt_gpio INT33FC:01: GPIO interrupt error, pins misconfigured. INT_STAT0: 0x01800000 > byt_gpio INT33FC:02: GPIO interrupt error, pins misconfigured. INT_STAT0: 0x00400000 > > The problem is that before this change the code calculating the valid_mask > would also disable interrupts on GPIOs which do not have their > BYT_DIRECT_IRQ_EN bit set. This now happens after the check done in > byt_gpio_irq_init_hw() causing these false-positive error messages. Isn't that easily fixed with something like this: diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 9afbc0612126..e865c889ba8d 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1411,11 +1411,11 @@ int gpiochip_add_data_with_key(struct gpio_chip *chip, void *data, machine_gpiochip_add(chip); - ret = gpiochip_irqchip_init_hw(chip); + ret = gpiochip_irqchip_init_valid_mask(chip); if (ret) goto err_remove_acpi_chip; - ret = gpiochip_irqchip_init_valid_mask(chip); + ret = gpiochip_irqchip_init_hw(chip); if (ret) goto err_remove_acpi_chip; (I sent a separate patch for this.) It isn't super-easy to know the right ordering semantics for init_hw vs init_valid_mask I think. Sadly we need to test it out in practice. > Even if we ignore the NULL pointer deref problem for now and we ignore > these 2 new error messages for now. Things are still broken with the > current changes in pinctrl/intel.git/for-next switching to letting > devm_gpiochip_add_data register the irqchip means that > acpi_gpiochip_request_interrupts() gets called before > gpiochip_add_pin_range() is called from pinctrl-baytrail.c, causing > the GPIO lookup of any ACPI _AEI handlers to fail, resulting in > errors like this one: > > byt_gpio INT33FC:02: Failed to request GPIO for pin 0x13: -517 > > And none of the _AEI handlers working I just vaguely understand this... If what you're saying is that the Baytrail driver is dependent on registering the pin ranges *before* registering the GPIO chip can we then: diff --git a/drivers/pinctrl/intel/pinctrl-baytrail.c b/drivers/pinctrl/intel/pinctrl-baytrail.c index beb26550c25f..b308567c5153 100644 --- a/drivers/pinctrl/intel/pinctrl-baytrail.c +++ b/drivers/pinctrl/intel/pinctrl-baytrail.c @@ -1549,16 +1549,20 @@ static int byt_gpio_probe(struct byt_gpio *vg) girq->handler = handle_bad_irq; } - ret = devm_gpiochip_add_data(&vg->pdev->dev, gc, vg); + /* + * Needs to happen first since the gpiochip is using pin + * control as back-end. + */ + ret = gpiochip_add_pin_range(gc, dev_name(&vg->pdev->dev), + 0, 0, vg->soc_data->npins); if (ret) { - dev_err(&vg->pdev->dev, "failed adding byt-gpio chip\n"); + dev_err(&vg->pdev->dev, "failed to add GPIO pin range\n"); return ret; } - ret = gpiochip_add_pin_range(&vg->chip, dev_name(&vg->pdev->dev), - 0, 0, vg->soc_data->npins); + ret = devm_gpiochip_add_data(&vg->pdev->dev, gc, vg); if (ret) { - dev_err(&vg->pdev->dev, "failed to add GPIO pin range\n"); + dev_err(&vg->pdev->dev, "failed adding byt-gpio chip\n"); return ret; } (Tell me if I should send this as a separate patch.) It's not entirely logical to have this semantic ordering so the extra comment explains it, I hope, in case it actually works. > TL;DR: commit 88583e340a0e ("pinctrl: intel: baytrail: Pass irqchip when adding gpiochip") > breaks a bunch of stuff and should be dropped from pinctrl/intel.git/for-next > and this needs some more work before it is ready for mainline. I don't know if that is such a good idea if this is a global problem, like something that would potentially disturb any ACPI-based GPIO chip. We might leave something else broken even if we fix the issue locally. Yours, Linus Walleij