On 5/1/24 08:06, Damien Le Moal wrote: > On 5/1/24 00:34, Christoph Hellwig wrote: >>> Fix this by modifying disk_should_remove_zone_wplug() to check that the >>> reference count to a zone write plug is not larger than 2, that is, that >>> the only references left on the zone are the caller held reference >>> (blk_zone_write_plug_complete_request()) and the initial extra reference >>> for the zone write plug taken when it was initialized (and that is >>> dropped when the zone write plug is removed from the hash table). >> >> How is this atomic_read() based check not racy? > > Because of how references work: > 1) A valid and unused zone write plug has a ref count of 1 > 2) A function using a write plug always has a reference on it, so if the plug is > valid and unused, the ref count is always 2 > 3) Any plugged BIO and in-flight BIOs and requests hold a reference on the plug. > So if the plug is used for BIOs, the reference count is always at least 2, and > when a function is using the plug the refcount is always at least 3 > > Based on this, all callers of disk_should_remove_zone_wplug() will always see a > refcount of 2 if the plug is unused, or more than 2 if the plug is being used to > handle BIOs. Most of the time, checking for the BUSY (PLUGGED) flag catches the > later case. But as explained in the commit message, chained BIOs due to splits > can lead to bio_endio() execution order to change and to calls to > blk_zone_write_plug_bio_endio() to be done after > blk_zone_write_plug_finish_request() calls disk_zone_wplug_unplug_bio(). > Checking that the plug refcount is not more than 2 tells us reliably that BIOs > are still holding references on the plug and that the plug should not be removed > until all BIOs completions are handled. > > Does this answer your question ? I modified the patch to add a comment in disk_should_remove_zone_wplug() explaining the above. -- Damien Le Moal Western Digital Research