On 3/28/24 13:30, Christoph Hellwig wrote: > I think this should go into the previous patch, splitting it > out just causes confusion. > >> +static void disk_free_zone_wplug(struct blk_zone_wplug *zwplug) >> +{ >> + struct gendisk *disk = zwplug->disk; >> + unsigned long flags; >> + >> + if (zwplug->flags & BLK_ZONE_WPLUG_NEEDS_FREE) { >> + kfree(zwplug); >> + return; >> + } >> + >> + spin_lock_irqsave(&disk->zone_wplugs_lock, flags); >> + list_add_tail(&zwplug->link, &disk->zone_wplugs_free_list); >> + spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); >> +} >> + >> static bool disk_insert_zone_wplug(struct gendisk *disk, >> struct blk_zone_wplug *zwplug) >> { >> @@ -630,18 +665,24 @@ static struct blk_zone_wplug *disk_get_zone_wplug(struct gendisk *disk, >> return zwplug; >> } >> >> +static void disk_free_zone_wplug_rcu(struct rcu_head *rcu_head) >> +{ >> + struct blk_zone_wplug *zwplug = >> + container_of(rcu_head, struct blk_zone_wplug, rcu_head); >> + >> + disk_free_zone_wplug(zwplug); >> +} > > Please verify my idea carefully, but I think we can do without the > RCU grace period and thus the rcu_head in struct blk_zone_wplug: > > When the zwplug is removed from the hash, we set the > BLK_ZONE_WPLUG_UNHASHED flag under disk->zone_wplugs_lock. Once > caller see that flag any lookup that modifies the structure > will fail/wait. If we then just clear BLK_ZONE_WPLUG_UNHASHED after > the final put in disk_put_zone_wplug when we know the bio list is > empty and no other state is kept (if there might be flags left > we should clear them before), it is perfectly fine for the > zwplug to get reused for another zone at this point. That was my thinking initially as well, which is why I did not have the grace period. However, getting a reference on a plug is a not done under disk->zone_wplugs_lock and is thus racy, albeit with a super tiny time window: the hash table lookup may "see" a plug that has already been removed and has a refcount dropped to 0 already. The use of atomic_inc_not_zero() prevents us from trying to keep using that stale plug, but we *are* referencing it. So without the grace period, I think there is a risk (again, super tiny window) that we start reusing the plug, or kfree it while atomic_inc_not_zero() is executing... I am overthinking this ? -- Damien Le Moal Western Digital Research