On 5/1/24 00:34, Christoph Hellwig wrote: >> Fix this by modifying disk_should_remove_zone_wplug() to check that the >> reference count to a zone write plug is not larger than 2, that is, that >> the only references left on the zone are the caller held reference >> (blk_zone_write_plug_complete_request()) and the initial extra reference >> for the zone write plug taken when it was initialized (and that is >> dropped when the zone write plug is removed from the hash table). > > How is this atomic_read() based check not racy? Because of how references work: 1) A valid and unused zone write plug has a ref count of 1 2) A function using a write plug always has a reference on it, so if the plug is valid and unused, the ref count is always 2 3) Any plugged BIO and in-flight BIOs and requests hold a reference on the plug. So if the plug is used for BIOs, the reference count is always at least 2, and when a function is using the plug the refcount is always at least 3 Based on this, all callers of disk_should_remove_zone_wplug() will always see a refcount of 2 if the plug is unused, or more than 2 if the plug is being used to handle BIOs. Most of the time, checking for the BUSY (PLUGGED) flag catches the later case. But as explained in the commit message, chained BIOs due to splits can lead to bio_endio() execution order to change and to calls to blk_zone_write_plug_bio_endio() to be done after blk_zone_write_plug_finish_request() calls disk_zone_wplug_unplug_bio(). Checking that the plug refcount is not more than 2 tells us reliably that BIOs are still holding references on the plug and that the plug should not be removed until all BIOs completions are handled. Does this answer your question ? -- Damien Le Moal Western Digital Research