On Sat, 20 Jul 2019, at 17:43, Linus Walleij wrote: > Hi Hongwei, > > after looking close at the driver and bindings I have this feeback: > > On Fri, Jul 19, 2019 at 9:25 PM Hongwei Zhang <hongweiz@xxxxxxx> wrote: > > +- reg : Address and length of the register set for the device > > This 0x100 range may look simple but in the driver it looks like > this: > > +static const struct aspeed_sgpio_bank aspeed_sgpio_banks[] = { > + { > + .val_regs = 0x0000, > + .rdata_reg = 0x0070, > + .irq_regs = 0x0004, > + .names = { "A", "B", "C", "D" }, > + }, > + { > + .val_regs = 0x001C, > + .rdata_reg = 0x0074, > + .irq_regs = 0x0020, > + .names = { "E", "F", "G", "H" }, > + }, > + { > + .val_regs = 0x0038, > + .rdata_reg = 0x0078, > + .irq_regs = 0x003C, > + .names = { "I", "J" }, > + }, > +}; > > So first break this into up to 10 different instances with one device > per bank instead. > > For each device: > > reg = <0x1e780200 4>, <x1e780204 ..>, <0x1e780270 ..>; > reg-names = "val", "irq", "rdata"; > > This way you use the device tree features for locating the > register offsets instead of hard-coding it into the driver, > which will make the driver simpler. > > > +- ngpios : number of GPIO pins to serialise. > > + (should be multiple of 8, up to 80 pins) > > This is wrong. This should be split into 10 different devices providing > 8 GPIO lines each. The "ngpios" is not intended to subdivide > banks, but for the case where you use say bits 0..11 of 32 bits in > a register by setting ngpios to 12. > > I see that they use the same clock divider and the same interrupt, > but this is better to partition into a separate clock divider driver > and up to 10 instances of GPIO devices with 8 GPIOs each. > > I also see that they use the same interrupt. This is fine, in your > driver just grab a shared IRQ and all the IRQ handlers will be > traversed. This is a well known pattern for all operating systems. > > > +- clocks : A phandle to the APB clock for SGPM clock division > > + > > +- bus-frequency : SGPM CLK frequency > > I see that there is a common clock control register in offset 0x54. > > Split this out to a separate clock driver that provides a divided clock > that all GPIO blocks can get, no matter if you use 1, 2 .. 10 of these > blocks. > > The fact that these GPIO banks and the clock register is in the same > memory range does not really matter. Split up the memory range in > on reg = per GPIO chip and one reg for the clock. The issue that I see with this is that the SGPIO bus clock configuration is not the only field in the SGPIO control register - it also contains the number of GPIOs the bus should output along with a bus enable bit. The clock, and particularly the number of GPIOs (due to the nature of the SGPIO bus) both need to be configured before we set the enable bit (or obviously all can be done simultaneously). I'll explore some of the options below (it's a bit train-of-thought, hopefully it makes sense): If the clock driver owns the control register, it also needs to know how many GPIOs we want to emit on the bus. This seems like an awkward configuration parameter for a clock driver. To avoid the weird parameter we could protect the control register with a lock shared between the clock driver and the SGPIO driver. This way the SGPIO driver could have the ngpios parameter, and request the clock after its written the ngpios value to the control register. A regmap would be useful here to avoid the resource clash and it also provides the required lock. However, you're also going down the path of splitting the driver such that there's one instance per bank. With this approach we need to solve two problems: Accounting for the total number of GPIOs, and only enabling the bus after the last bank has had its driver probed. The accounting might be handled by accumulating the number of GPIOs in each bank in the control register itself, e.g. the driver implementation does something like: spin_lock(...) ctrl = ioread32(...) ngpios = FIELD_GET(ASPEED_SGPIO_CTRL_NGPIOS, ctrl); ngpios += 8; ctrl &= ~ASPEED_SGPIO_CTRL_NGPIOS; ctrl |= FIELD_PREP(ASPEED_SGPIO_CTRL_NGPIOS, ngpios); iowrite32(ctrl, ...); spin_unlock(...); This works on cold boot of the device when the ngpios field is set to zero due to reset, however will fail on subsequent warm reboots if the GPIO IP block is protected from reset by the SoC's watchdog configuration: the field will not be zeroed in this case, and the value of the field is represented by `NR_BOOTS * NR_GPIOS`, which is incorrect. To do this correctly I guess we would need some other global state held in the driver implementation (zeroed when the kernel is loaded), and write the incremented value to the control register on each probe() invocation. However, that aside, we can't simply enable the bus in the clock enable callback if enable is called per-bank, as it is called once on the first request with further requests simply refcounted as you mentioned. This is exactly the behaviour we can't tolerate with the bus: it must only be enabled after the last GPIO bank is registered, when we know the total number of GPIOs to emit. So we can only call clk_prepare_enable() after the last bank is probed, or in the probe() of the last bank if the driver has some way to tell it is the last instance. In my mind per-bank clock management makes the implementation more involved than it should be, and isn't really reflective of what the clk subsystem is trying to achieve (violates clk_prepare()/clk_enable() post conditions, or boils down to just calling devm_clk_get() in probe() with no further action, which seems kinda pointless). Having a separate clk implementation whose instance is requested once by a monolithic SGPIO driver matches how the hardware is designed. However, given the issue of sharing the register with the GPIO count, in my opinion it seems to make more sense to leave the driver design as is and minimise the accidental complexity introduced by the behavioural mismatch of the clk abstraction and the addition of locking where it wasn't previously necessary. Interested in your thoughts. Andrew