On Thu, Aug 18, 2022 at 1:13 PM Linus Walleij <linus.walleij@xxxxxxxxxx> wrote: > > On Thu, Aug 18, 2022 at 11:48 AM Arnd Bergmann <arnd@xxxxxxxx> wrote: > > > As I understood, the problem that Christophe ran into is that the > > dynamic registration of additional gpio chips is broken because > > it unregisters the chip if the number space is exhausted: > > > > base = gpiochip_find_base(gc->ngpio); > > if (base < 0) { > > ret = base; > > spin_unlock_irqrestore(&gpio_lock, flags); > > goto err_free_label; > > } > > > > From the git history, it looks like this error was never handled gracefully > > even if the intention was to keep going without a number assignment, > > so there are probably other bugs one runs into after changing this. > > Hm that should be possible to get rid of altogether? I suppose it is only > there to satisfy > > static inline bool gpio_is_valid(int number) > { > return number >= 0 && number < ARCH_NR_GPIOS; > } > > ? > > If using GPIO descriptors, any descriptor != NULL is valid, > this one is just used with legacy GPIOs. Maybe we should just > delete gpio_is_valid() everywhere and then drop the cap? I think it makes sense to keep gpio_is_valid() for as long as we support the numbers. > I think there may be systems and users that still depend on GPIO base > numbers being assigned from ARCH_NR_GPIOS and > downwards (userspace GPIO numbers in sysfs will also change...) > otherwise we could assign from 0 and up. Is it possible to find in-kernel users that depend on well-known numbers for dynamically assigned gpios? I would argue that those are always broken. Even for the sysfs interface, it is questionable to rely on specific numbers because at least in an arm multiplatform kernel the top number changes based on kernel configuration. > Right now the safest would be: > Assign from 512 and downwards until we hit 0 then assign > from something high, like U32_MAX and downward. > > That requires dropping gpio_is_valid() everywhere. > > If we wanna be bold, just delete gpio_is_valid() and assign > bases from 0 and see what happens. But I think that will > lead to regressions. I'm still unsure how removing gpio_is_valid() would help. What I could imagine as a next step would be to mark all consumer drivers and the sysfs interface that use gpio numbers as 'depends on GPIO_LEGACY' and then only provide the corresponding drivers if that option is set. After that, we could try to make sure we can run the defconfigs for modern architectures (arm64, riscv, x86-64, ...) without the option by converting the drivers that are most commonly used. Arnd