On Wed, Dec 20, 2023 at 1:23 AM Wolfram Sang <wsa@xxxxxxxxxx> wrote: > > On Thu, Dec 07, 2023 at 04:21:59PM +0800, Jensen Huang wrote: > > Possible deadlock scenario (on reboot): > > rk3x_i2c_xfer_common(polling) > > -> rk3x_i2c_wait_xfer_poll() > > -> rk3x_i2c_irq(0, i2c); > > --> spin_lock(&i2c->lock); > > ... > > <rk3x i2c interrupt> > > -> rk3x_i2c_irq(0, i2c); > > --> spin_lock(&i2c->lock); (deadlock here) > > > > Store the IRQ number and disable/enable it around the polling transfer. > > This patch has been tested on NanoPC-T4. > > > > Signed-off-by: Jensen Huang <jensenhuang@xxxxxxxxxxxxxxx> > > Reviewed-by: Heiko Stuebner <heiko@xxxxxxxxx> > > Reviewed-by: Andi Shyti <andi.shyti@xxxxxxxxxx> > > Applied to for-current, thanks! > > But I'd like to see the follow-up patches somewhen which have been > discussed in this thread. I've made some attempts, such as removing irqsave/restore, but it didn't meet my expectations and requires further testing. Therefore, I may not be able to submit such a patch in the short term. However, if someone else submits a new patch, I'd be happy to test it on rk3328/rk3399. On Fri, Dec 8, 2023 at 8:17 PM Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> wrote: > > On 12/8/23 11:53, Jensen Huang wrote: > > On Fri, Dec 8, 2023 at 12:00 AM Dmitry Osipenko > > <dmitry.osipenko@xxxxxxxxxxxxx> wrote: > >> > >> On 12/7/23 17:10, Dragan Simic wrote: > >>> On 2023-12-07 10:25, Jensen Huang wrote: > >>>> On Thu, Dec 7, 2023 at 4:37 PM Dragan Simic <dsimic@xxxxxxxxxxx> wrote: > >>>>> > >>>>> On 2023-12-07 09:21, Jensen Huang wrote: > >>>>>> Possible deadlock scenario (on reboot): > >>>>>> rk3x_i2c_xfer_common(polling) > >>>>>> -> rk3x_i2c_wait_xfer_poll() > >>>>>> -> rk3x_i2c_irq(0, i2c); > >>>>>> --> spin_lock(&i2c->lock); > >>>>>> ... > >>>>>> <rk3x i2c interrupt> > >>>>>> -> rk3x_i2c_irq(0, i2c); > >>>>>> --> spin_lock(&i2c->lock); (deadlock here) > >>>>>> > >>>>>> Store the IRQ number and disable/enable it around the polling > >>>>> transfer. > >>>>>> This patch has been tested on NanoPC-T4. > >>>>> > >>>>> In case you haven't already seen the related discussion linked below, > >>>>> please have a look. I also added more people to the list of recipients, > >>>>> in an attempt to make everyone aware of the different approaches to > >>>>> solving this issue. > >>>>> > >>>>> https://lore.kernel.org/all/655177f4.050a0220.d85c9.3ba0@xxxxxxxxxxxxx/T/#m6fc9c214452fec6681843e7f455978c35c6f6c8b > >>>> > >>>> Thank you for providing the information. I hadn't seen this link before. > >>>> After carefully looking into the related discussion, it appears that > >>>> Dmitry Osipenko is already working on a suitable patch. To avoid > >>>> duplication > >>>> or conflicts, my patch can be discarded. > >>> > >>> Thank you for responding so quickly. Perhaps it would be best to hear > >>> from Dmitry as well, before discarding anything. It's been a while > >>> since Dmitry wrote about working on the patch, so he might have > >>> abandoned it. > >> > >> This patch is okay. In general, will be better to have IRQ disabled by > >> default like I did in my variant, it should allow to remove the spinlock > >> entirely. Of course this also can be done later on in a follow up > >> patches. Jensen, feel free to use my variant of the patch, add my > >> s-o-b+co-developed tags to the commit msg if you'll do. Otherwise I'll > >> be able to send my patch next week. > > > > Thank you for the suggestion. I've updated the patch to your variant, and > > as confirmed by others, reboots are functioning correctly. I measured the > > overhead of enable_irq/disable_irq() by calculating ktime in the > > updated version, > > and on rk3399, the minimum delta I observed was 291/875 ns. This extra > > cost may impact most interrupt-based transfers. Therefore, I personally lean > > towards the current v2 patch and handle the spinlock and irqsave/restore in > > a follow up patch. I'd like to hear everyone's thoughts on this. > > Please don't use ktime for perf measurements, ktime itself is a slow API > and it should be 200us that ktime takes itself. Also, the 0.2us is > practically nothing, especially compared to I2C transfers measured in ms. > > I'm fine with keeping your v2 variant for the bug fix if you prefer > that. Thanks for addressing the issue :) Thank you for clarifying the situation with ktime. Honestly, I also believe that the overhead of disable_irq/enable_irq should be minimal. Considering that system frequency adjustments involve I2C transfers, I conducted this 'somewhat imprecise' measurements. -- Best regards, Jensen