On Thu, Mar 21, 2019 at 02:13:04PM +0100, Andreas Gruenbacher wrote:
> Hi Christoph,
>
> we need your help fixing a gfs2 deadlock involving iomap. What's going
> on is the following:
>
> * During iomap_file_buffered_write, gfs2_iomap_begin grabs the log flush
>   lock and keeps it until gfs2_iomap_end. It currently always does that
>   even though there is no point other than for journaled data writes.
>
> * iomap_file_buffered_write then calls balance_dirty_pages_ratelimited.
>   If that ends up calling gfs2_write_inode, gfs2 will try to grab the
>   log flush lock again and deadlock.

What is the exact call chain?  balance_dirty_pages_ratelimited these days
doesn't start I/O, but just wakes up the flusher threads.  Or do we have
an issue where it is blocking on those threads?

Also, why do you need to flush the log for background writeback in
->write_inode?

balance_dirty_pages_ratelimited is by definition not a data integrity
writeback, so there shouldn't be a good reason to flush the log (which
I assume the log flush lock is for).

If we look at gfs2_write_inode, this seems to be the code:

	bool flush_all = (wbc->sync_mode == WB_SYNC_ALL || gfs2_is_jdata(ip));

	if (flush_all)
		gfs2_log_flush(GFS2_SB(inode), ip->i_gl,
			       GFS2_LOG_HEAD_FLUSH_NORMAL |
			       GFS2_LFC_WRITE_INODE);

But what is the requirement to do this in writeback context?  Can't
we move it out into another context instead?