Hi Everyone, On Tue, Apr 21, 2020 at 10:21 AM Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote: > > On Tue, Apr 21, 2020 at 04:06:24PM +0200, Uwe Kleine-König wrote: > > On Tue, Apr 21, 2020 at 03:55:53PM +0300, Andy Shevchenko wrote: > > > On Mon, Apr 20, 2020 at 11:23:57PM -0400, Paul Thomas wrote: > > > > On Mon, Apr 20, 2020 at 1:27 PM Andy Shevchenko > > > > <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > > The commit 96d7c7b3e654 ("gpio: gpio-pca953x, Add get_multiple function") > > > > > basically did everything wrong from style and code reuse perspective, i.e. > > > > Hi Andy, > > > > > > > > Well your version is certainly elegant and simple, and does better > > > > with code reuse. However there are a couple of other goals I had in > > > > mind. > > > > First, the "lazy" approach of 96d7c7b3e654 is actually faster when > > > > user space sets up a 8-bit linehandle[1]146us (single regmap_read()) > > > > vs 172us (pca953x_read_regs()) which incidentally is what we do in our > > > > application. In lazily reading 1 byte at a time it is the fastest > > > > access for that, if user space is always setting up the linehandle for > > > > the whole chip pca953x_read_regs() would be faster. Seeing as > > > > get_multiple has been unimplemented for this chip until now perhaps > > > > our use case deserves some consideration? > > > > > > I understand completely your goal, but > > > - for I²C expanders timings is the last thing to look for (they are quite slow > > > by nature), so, I really don't care about 16% speed up for one call; don't > > > forget that we are in multi-task OS, where this can be easily interrupted and > > > user will see the result quite after expected quick result Sure, it's not a HUGE deal, but this will get worse for 5 bank chips. Also, 26us is not insignificant, with the preempt-rt kernel we're using that can be more than the max interrupt latency. > > > > I didn't do any timing analysis and while I understand your > > argumentation, I'm not sure to agree. I noticed while debugging the > > problem that then resulted in my fix that gpioctl uses the .set_multiple > > callback. I told my customer to use gpioctl instead of /sys/class/gpio > > because it performs better just to notice that when using gpioctl to set > > a single GPIO this transfers five bytes instead of only two. > > > > Having said that I think the sane approach is to optimize > > .{g,s}et_multiple to reduce the read/write size to the smallest bulk > > size possible that covers all bits that have their corresponding bit set > > in mask. > > > > > - the code maintenance has a priority over micro-optimization (this driver > > > suffered many times of breakages because of some optimizations done) Yeah, I appreciate that maintainability needs to be a big priority, I'm wondering if comments & syntax could be improved so that the general idea of 96d7c7b3e654 is clear and maintainable. It is just walking through mask, and whenever it gets to a new byte it reads it from the hardware. > > > > ack here. Some parts of the driver were harder to grasp than necessary. > > > > > - it breaks Uwe's approach to fix AI chips, after my patch Uwe's ones are > > > applied cleanly > > > > I didn't check, is 96d7c7b3e654 broken for some chips? > > I was referring to another call to recalc address with additional parameters, > which your second patch actually gets rid of. If it's just the call to pca953x_recalc_addr() that caused the conflict with Uwe's work with 96d7c7b3e654, we can just remove the last two arguments so it matches what pca953x_gpio_get_value() is doing. thanks, Paul