On Fri, Dec 15, 2023 at 02:49:55AM -0800, Christoph Hellwig wrote: > On Fri, Dec 15, 2023 at 05:43:53AM +0000, Matthew Wilcox wrote: > > block_write_full_page() handles this fine, so why isn't this > > > > if (buffer_locked(bh)) > > goto confused; > > > > Is the thought that we hold the folio locked, therefore no buffers > > should be locked at this time? I don't know the rules. > > That would be my guess. For writeback on the fs mapping no one > else should ever have locked the buffer. For the bdev mapping > buffer locking isn't completely controlled by the bdevfs, but also > by any mounted fs using the buffer cache. So from my POV your > above change should be ok, but it'll need a big fat comment so > that the next person seeing it in 20 years isn't as confused as > you are now :) Thanks! Now I'm thinking that it's not OK to call mpage_writepages(). __block_write_full_folio() takes the buffer lock and holds it across the I/O (it seems that we never split the buffer lock into buffer writeback like we did the page lock?) So filesystems may have an expectation that locking the buffer synchronises with writeback. Perhaps it's safest to just call block_write_full_page() in a loop for blockdev? I wrote up this before realising that I'd misread the code; adding it here for posterity. For filesystems, we know that the filesystem will not lock the buffer since the writeback path holds the folio locked at this point. For block devices, the filesystem mounted on it may lock the buffer without locking the folio first (observed with both ext2 and ext4). If we find the buffer locked, just fall back to block_write_full_page() which handles potentially locked buffers correctly. It's fine if we win the race and the filesystem locks the buffer after this point; block_write_full_page() doesn't hold the buffer locked across the I/O, so the filesystem has no expectations that locking the buffer will wait for data writeback.