I added some tracing to fs/ceph/addr.c and this highlights the bug causing the
hang that I'm seeing.

So what I see is ceph_writepages_start() being entered and getting a
collection of folios from filemap_get_folios_tag():

	netfs_ceph_writepages: i=10000004f52 ix=0
	netfs_ceph_wp_get_folios: i=10000004f52 oix=0 ix=8000000000000 nr=6

Then we get out the first dirty folio from the batch and attempt to lock it:

	netfs_folio: i=10000004f52 ix=00003-00003 ceph-wb-lock

which succeeds.  We then pass through a number of lines:

	netfs_ceph_wp_track: i=10000004f52 line=1218

which is the "/* shift unused page to beginning of fbatch */" comment, then:

	netfs_ceph_wp_track: i=10000004f52 line=1238

which is followed by "offset = ceph_fscrypt_page_offset(pages[0]);", then:

	netfs_ceph_wp_track: i=10000004f52 line=1264

which is the error handling path of:

	if (!ceph_inc_osd_stopping_blocker(fsc->mdsc)) {
		rc = -EIO;
		goto release_folios;
	}

and then:

	netfs_ceph_wp_track: i=10000004f52 line=1389

which is "release_folios:".

We then reenter ceph_writepages_start(), get the same batch of dirty folios
and try to lock them again:

	netfs_ceph_writepages: i=10000004f52 ix=0
	netfs_ceph_wp_get_folios: i=10000004f52 oix=0 ix=8000000000000 nr=6
	netfs_folio: i=10000004f52 ix=00003-00003 ceph-wb-lock

and that's where we hang.

I think the problem is that the error handling here:

	if (!ceph_inc_osd_stopping_blocker(fsc->mdsc)) {
		rc = -EIO;
		goto release_folios;
	}

is insufficient.  The folios are locked and can't just be released.

Why ceph_inc_osd_stopping_blocker() fails is also something that needs looking
at.

David
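
For illustration only, here is a toy userspace model of the suspected lock leak. It is not the kernel code and not a proposed patch. Every name in it (toy_folio, toy_folio_trylock(), toy_folio_unlock(), toy_writepages()) is hypothetical. The blocker_ok flag stands in for the result of ceph_inc_osd_stopping_blocker(), and -EAGAIN marks the point where the real code would sleep on the folio lock. Unlike the trace above, the model locks every folio in the batch rather than a single one; that simplification is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a folio: only the lock bit matters here. */
struct toy_folio {
	bool locked;
};

/* Non-blocking lock attempt; returns false where the kernel would sleep. */
static bool toy_folio_trylock(struct toy_folio *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

static void toy_folio_unlock(struct toy_folio *f)
{
	assert(f->locked);
	f->locked = false;
}

/*
 * One writeback pass over a batch.
 *
 * blocker_ok      models the result of ceph_inc_osd_stopping_blocker().
 * unlock_on_error selects whether the error path unlocks what it locked.
 *
 * Returns 0 on success, -EIO if the blocker check fails, or -EAGAIN if a
 * folio is already locked (where the real code would hang instead).
 */
static int toy_writepages(struct toy_folio *batch, size_t nr,
			  bool blocker_ok, bool unlock_on_error)
{
	size_t locked = 0;
	int rc = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!toy_folio_trylock(&batch[i])) {
			rc = -EAGAIN;
			goto release;
		}
		locked++;
	}

	if (!blocker_ok) {
		rc = -EIO;
		goto release;
	}

	/* Writeback would be submitted here. */

release:
	/* Only unlock the folios this pass actually locked. */
	if (rc == 0 || unlock_on_error) {
		for (size_t i = 0; i < locked; i++)
			toy_folio_unlock(&batch[i]);
	}
	return rc;
}
```

With unlock_on_error false, a pass that fails the blocker check leaves its folios locked, so the next pass over the same batch cannot take the first lock. With unlock_on_error true, the retry can proceed.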