On Fri, Oct 18, 2024 at 09:57:12AM -0700, Bart Van Assche wrote: > Thank you for having reported this issue and for having proposed a > patch. I think the following is missing from the patch description: > (a) An analysis of which code causes the inconsistent nested lock order. > (b) A discussion of potential alternatives. > > It seems unavoidable to me that some code freezes request queue(s) > before the limits are updated. Additionally, there is code that freezes > queues first (sd_revalidate_disk()), executes commands and next updates > limits. Hence the inconsistent order of freezing queues and obtaining > limits_lock. > > The alternative (entirely untested) solution below has the following > advantages: > * No additional information has to be provided to lockdep about the > locking order. > * No new flags are required (SKIP_FREEZE_LOCKDEP). > * No exceptions are necessary for blk_queue_exit() nor for the NVMe > driver. As mentioned by Ming fixing the lockdep reports is a bit different from adding it first. But I'll chime in for your proposal to fix the report anyway. I think requiring the queue to be frozen before we start the queue limits update is reasonable in general, but a major pain for sd_revalidate_disk, and unfortunately I don't think your proposed patch works there as it doesn't protect against concurrent updates from say sysfs, and also doesn't scale to adding new limits. One option to deal with this would be to add an update sequence to the queue_limits. sd_revalidate_disk would sample it where it currently starts the update, then drop the lock and query the device, then takes the limits lock again, checks for the sequence. If it matches it commits the update, else it retries.