Hi, [SPINLOCK AND END WRITEBACK] Although folio_end_writeback() can be called in an interruption context (by the in_task() check), surprisingly it may not be suitable to be called inside a spinlock (in task context): For example the following call chain can lead to sleep: spin_lock() folio_end_writeback() |- folio_end_dropbehind_write() |- folio_launder() Which can sleep. My question is, can and should we allow folio_end_writeback() inside a spinlock? [BTRFS' ASYNC EXTENT PROBLEM] This is again a btrfs specific behavior, that we have to call folio_end_writeback() inside a spinlock to make sure really only one thread can clear the writeback flag of a folio. I know iomap is doing a pretty good job preventing early finished writeback to clear the folio writeback flag, meanwhile we're still submitting other blocks inside the folio. Iomap goes holding one extra writeback count before starting marking blocks writeback and submitting them. And after all blocks are submitted, reduce the writeback count (and call folio_end_writeback() if it's the last one holding the writeback flag). But the iomap solution requires that, all blocks inside a folio to be submitted at the same time. This is not true in btrfs, due to the design of btrfs' async extent, which can mark the blocks for writeback really at any timing, as we keep the lock of folios and queue them into a workqueue to do compression, then only after the compression is done, we submit and mark them writeback and unlock. If we do not hold a spinlock calling folio_end_writeback(), but only checks if we're the last holder and call folio_end_writeback() out of the spinlock, we can hit the following race where folio_end_writeback() can be called on the same folio: 0 32K 64K |<- AE 1 ->|<- AE 2 ->| Thread A (AE 1) | Thread B (AE 2) --------------------------------------+------------------------------ submit_one_async_extent() | |- process_one_folio() | |- subpage_set_writeback() | | /* IO finished */ | end_compressed_writeback() | |- btrfs_folio_clear_writeback() | |- spin_lock() | | /* this is the last writeback | | holder, should end the | | folio writeback flag */ | |- last = true | |- spin_unlock() | | | submit_one_async_extent() | | |- process_one_folio() | | |- subpage_set_writeback() | | | | /* IO finished */ | | end_compressed_writeback() | | |btrfs_folio_clear_writeback() | | | /* Again the last holder */ | | |- spin_lock() | | |- last = true | | |- spin_unlock() |- folio_end_writeback() | |- folio_end_writeback() I know the most proper solution would to get rid of the delayed unlock and submission, but mark blocks for writeback without the async extent mechanism completely, then follow the iomap's solution. But that will be a huge change (although we will go that path eventually), meanwhile the folio_end_writeback() inside spinlock needs to be fixed asap. So my question is, can we allow folio_end_writeback() to be called inside a spinlock, at the cost of screwing up the folio reclaim for DONTCACHE? Thanks, Qu