On 10/3/24 9:49 AM, Waiman Long wrote: > On 10/3/24 10:38, Jens Axboe wrote: >> On 10/3/24 8:31 AM, Dan Carpenter wrote: >>> On Thu, Oct 03, 2024 at 07:21:25AM -0600, Jens Axboe wrote: >>>> On 10/3/24 6:03 AM, Dan Carpenter wrote: >>>>> 3117 ioc_now(iocg->ioc, &now); >>>>> 3118 weight_updated(iocg, &now); >>>>> 3119 spin_unlock(&iocg->ioc->lock); >>>>> 3120 } >>>>> 3121 } >>>>> 3122 spin_unlock_irq(&blkcg->lock); >>>>> 3123 >>>>> 3124 return nbytes; >>>>> 3125 } >>>>> 3126 >>>>> 3127 blkg_conf_init(&ctx, buf); >>>>> 3128 >>>>> 3129 ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, &ctx); >>>>> 3130 if (ret) >>>>> 3131 goto err; >>>>> 3132 >>>>> 3133 iocg = blkg_to_iocg(ctx.blkg); >>>>> 3134 >>>>> 3135 if (!strncmp(ctx.body, "default", 7)) { >>>>> 3136 v = 0; >>>>> 3137 } else { >>>>> 3138 if (!sscanf(ctx.body, "%u", &v)) >>>>> 3139 goto einval; >>>>> 3140 if (v < CGROUP_WEIGHT_MIN || v > CGROUP_WEIGHT_MAX) >>>>> 3141 goto einval; >>>>> 3142 } >>>>> 3143 >>>>> 3144 spin_lock(&iocg->ioc->lock); >>>>> >>>>> But why is this not spin_lock_irq()? I haven't analyzed this so maybe it's >>>>> fine. >>>> That's a bug. >>>> >>> I could obviously write this patch but I feel stupid writing the >>> commit message. My level of understanding is Monkey See Monkey do. >>> Could you take care of this? >> Sure - or let's add Tejun who knows this code better. Ah he's already >> added. Tejun? >> >>> So somewhere we're taking a lock in the IRQ handler and this can lead >>> to a deadlock? I thought this would have been caught by lockdep? >> It's nested inside blkcg->lock which is IRQ safe, that is enough. But >> doing a quick scan of the file, the usage is definitely (widly) >> inconsistent. Most times ioc->lock is grabbed disabling interrupts, but >> there are also uses that doesn't disable interrupts, coming from things >> like seq_file show paths which certainly look like they need it. lockdep >> should certainly warn about this, only explanation I have is that nobody >> bothered to do that :-) > > The lockdep validator will only warn about this if a debug kernel with > lockdep enabled has run a workload that exercises all the relevant > locking sequences that can implicate a potential for deadlock. Sure that's obvious, but there are quite a few easy ones in there, so seems like it should be easy to trigger. It's not like it's only some odd path, the irq on/off looks trivial. -- Jens Axboe