On Fri, Nov 19, 2010 at 01:45:52AM +0100, Jan Kara wrote: > On Wed 17-11-10 22:28:34, Andrew Morton wrote: > > The fact that a call to ->write_begin can randomly return with s_umount > > held, to be randomly released at some random time in the future is a > > bit ugly, isn't it? write_begin is a pretty low-level, per-inode > > thing. > I guess you missed that writeback_inodes_sb_nr() (called from _if_idle > variants) does: > bdi_queue_work(sb->s_bdi, &work); > wait_for_completion(&done); > So we return only after all the IO has been submitted and unlock s_umount > in writeback_inodes_sb_if_idle(). And we cannot really submit the IO ourselves > because we are holding i_mutex and we need to get and put references > to other inodes while doing writeback (those would be really horrible lock > dependencies - writeback thread can put the last reference to an unlinked > inode...). But if we're waiting for it, with the lock held, then surely it can deadlock just the same as if we submit it ourself? BTW. are you taking i_mutex inside writeback? I mutex can be held while entering page reclaim, and thus writepage... so it could be a bug too. > In fact, as I'm speaking about it, pushing things to writeback thread and > waiting on the work does not help a bit with the locking issues (we didn't > wait for the work previously but that had other issues). Bug, sigh. > > What might be better interface for usecases like above is to allow > filesystem to kick flusher thread to start doing background writeback > (regardless of dirty limits). Then the caller can wait for some delayed > allocation reservations to get freed (easy enough to check in > ->writepage() and wake the waiters) - possibly with a reasonable timeout > so that we don't stall forever. We really need to throttle the producer without any locks held, no? So the filesystem would like to to hook into dirtying path somewhere without i_mutex held (and without implementing your own .write). Eg. around the dirty throttling calls somewhere I suppose. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html