Hi Christoph, On Tue, Jul 19, 2022 at 7:07 AM Christoph Hellwig <hch@xxxxxx> wrote: > > Use filemap_fdatawrite_wbc instead of generic_writepages in > gfs2_ail1_start_one so that the functin can also cope with address_space > operations that only implement ->writepages and to properly account > for cgroup writeback. > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > Reviewed-by: Andreas Gruenbacher <agruenba@xxxxxxxxxx> > --- > fs/gfs2/log.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c > index f0ee3ff6f9a87..a66e3b1f6d178 100644 > --- a/fs/gfs2/log.c > +++ b/fs/gfs2/log.c > @@ -131,7 +131,7 @@ __acquires(&sdp->sd_ail_lock) > if (!mapping) > continue; > spin_unlock(&sdp->sd_ail_lock); > - ret = generic_writepages(mapping, wbc); > + ret = filemap_fdatawrite_wbc(mapping, wbc); this patch unfortunately breaks journaled data inodes. We're in function gfs2_ail1_start_one() here, which is usually called via gfs2_log_flush() -> empty_ail1_list() -> gfs2_ail1_start() -> gfs2_ail1_flush() -> gfs2_ail1_start_one(), and we're going through the list of buffer heads in the transaction to be flushed. We used to submit each dirty buffer for I/O individually, but since commit 5ac048bb7ea6 ("GFS2: Use filemap_fdatawrite() to write back the AIL"), we're submitting all the dirty pages in the metadata address space this buffer head belongs to. That's slightly bizarre, but it happens to catch exactly the same buffer heads that are in the transaction, so we end up with the same result. From what I'm being told, this was done as a performance optimization -- but nobody actually knows the details anymore. The above change means that instead of calling generic_writepages(), we end up calling filemap_fdatawrite_wbc() -> do_writepages() -> mapping->a_ops->writepages(). But that's something completely different; the writepages address space operation operates is outward facing, while we really only want to write out the dirty buffers / pages in the underlying address space. In case of journaled data inodes, gfs2_jdata_writepages() actually ends up trying to create a filesystem transaction, which immediately hangs because we're in the middle of a log flush. So I'm tempted to revert the following two of your commits; luckily that's independent from the iomap_writepage() removal: d3d71901b1ea ("gfs2: remove ->writepage") b2b0a5e97855 ("gfs2: stop using generic_writepages in gfs2_ail1_start_one") I think we could go through iomap_writepages() instead of generic_writepages() here as well, but that's for another day. As far as cgroup writeback goes, this is journal I/O and I don't have the faintest idea how the accounting for that is even supposed to work. > if (need_resched()) { > blk_finish_plug(plug); > cond_resched(); > @@ -222,8 +222,7 @@ void gfs2_ail1_flush(struct gfs2_sbd *sdp, struct writeback_control *wbc) > spin_unlock(&sdp->sd_ail_lock); > blk_finish_plug(&plug); > if (ret) { > - gfs2_lm(sdp, "gfs2_ail1_start_one (generic_writepages) " > - "returned: %d\n", ret); > + gfs2_lm(sdp, "gfs2_ail1_start_one returned: %d\n", ret); > gfs2_withdraw(sdp); > } > trace_gfs2_ail_flush(sdp, wbc, 0); > -- > 2.30.2 > Thanks, Andreas