On 29.06.21 г. 16:59, Josef Bacik wrote:
> I've been debugging an early ENOSPC problem in production and finally
> root caused it to this problem. When we switched to the per-inode in
> 38d715f494f2 ("btrfs: use btrfs_start_delalloc_roots in
> shrink_delalloc") I pulled out the async extent handling, because we
> were doing the correct thing by calling filemap_flush() if we had async
> extents set. This would properly wait on any async extents by locking
> the page in the second flush, thus making sure our ordered extents were
> properly set up.
>
> However when I switched us back to page based flushing, I used
> sync_inode(), which allows us to pass in our own wbc. The problem here
> is that sync_inode() is smarter than the filemap_* helpers, it tries to
> avoid calling writepages at all. This means that our second call could
> skip calling do_writepages altogether, and thus not wait on the pagelock
> for the async helpers. This means we could come back before any ordered
> extents were created and then simply continue on in our flushing
> mechanisms and ENOSPC out when we have plenty of space to use.
>
> Fix this by putting back the async pages logic in shrink_delalloc. This
> allows us to bulk write out everything that we need to, and then we can
> wait in one place for the async helpers to catch up, and then wait on
> any ordered extents that are created.
>
> Fixes: e076ab2a2ca7 ("btrfs: shrink delalloc pages instead of full inodes")
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

This patch really depends on the next one in order to be correct. IMO
this dependency should be explicitly stated in the changelog, and the
patches should be reordered.