Re: [PATCH v2] blk_iocost: remove some duplicate irq disable/enables

Jens Axboe <axboe@xxxxxxxxx> · Thu, 3 Oct 2024 10:24:57 -0600

On 10/3/24 9:49 AM, Waiman Long wrote:
> On 10/3/24 10:38, Jens Axboe wrote:
>> On 10/3/24 8:31 AM, Dan Carpenter wrote:
>>> On Thu, Oct 03, 2024 at 07:21:25AM -0600, Jens Axboe wrote:
>>>> On 10/3/24 6:03 AM, Dan Carpenter wrote:
>>>>>    3117                                  ioc_now(iocg->ioc, &now);
>>>>>    3118                                  weight_updated(iocg, &now);
>>>>>    3119                                  spin_unlock(&iocg->ioc->lock);
>>>>>    3120                          }
>>>>>    3121                  }
>>>>>    3122                  spin_unlock_irq(&blkcg->lock);
>>>>>    3123
>>>>>    3124                  return nbytes;
>>>>>    3125          }
>>>>>    3126
>>>>>    3127          blkg_conf_init(&ctx, buf);
>>>>>    3128
>>>>>    3129          ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, &ctx);
>>>>>    3130          if (ret)
>>>>>    3131                  goto err;
>>>>>    3132
>>>>>    3133          iocg = blkg_to_iocg(ctx.blkg);
>>>>>    3134
>>>>>    3135          if (!strncmp(ctx.body, "default", 7)) {
>>>>>    3136                  v = 0;
>>>>>    3137          } else {
>>>>>    3138                  if (!sscanf(ctx.body, "%u", &v))
>>>>>    3139                          goto einval;
>>>>>    3140                  if (v < CGROUP_WEIGHT_MIN || v > CGROUP_WEIGHT_MAX)
>>>>>    3141                          goto einval;
>>>>>    3142          }
>>>>>    3143
>>>>>    3144          spin_lock(&iocg->ioc->lock);
>>>>>
>>>>> But why is this not spin_lock_irq()?  I haven't analyzed this so maybe it's
>>>>> fine.
>>>> That's a bug.
>>>>
>>> I could obviously write this patch but I feel stupid writing the
>>> commit message. My level of understanding is Monkey See Monkey do.
>>> Could you take care of this?
>> Sure - or let's add Tejun who knows this code better. Ah he's already
>> added. Tejun?
>>
>>> So somewhere we're taking a lock in the IRQ handler and this can lead
>>> to a deadlock? I thought this would have been caught by lockdep?
>> It's nested inside blkcg->lock which is IRQ safe, that is enough. But
>> doing a quick scan of the file, the usage is definitely (widly)
>> inconsistent. Most times ioc->lock is grabbed disabling interrupts, but
>> there are also uses that doesn't disable interrupts, coming from things
>> like seq_file show paths which certainly look like they need it. lockdep
>> should certainly warn about this, only explanation I have is that nobody
>> bothered to do that :-)
> 
> The lockdep validator will only warn about this if a debug kernel with
> lockdep enabled has run a workload that exercises all the relevant
> locking sequences that can implicate a potential for deadlock.

Sure that's obvious, but there are quite a few easy ones in there, so
seems like it should be easy to trigger. It's not like it's only some
odd path, the irq on/off looks trivial.

-- 
Jens Axboe