On Fri, Mar 02, 2012 at 11:57:00AM -0800, Andrew Morton wrote:
> On Fri, 2 Mar 2012 18:39:51 +0800
> Fengguang Wu <fengguang.wu@xxxxxxxxx> wrote:
>
> > > And I agree it's unlikely but given enough time and people, I
> > > believe someone finds a way to (inadvertedly) trigger this.
> >
> > Right. The pageout works could add lots more iput() to the flusher
> > and turn some hidden statistical impossible bugs into real ones.
> >
> > Fortunately the "flusher deadlocks itself" case is easy to detect and
> > prevent as illustrated in another email.
>
> It would be a heck of a lot safer and saner to avoid the iput().  We
> know how to do this, so why not do it?

My concern about the page lock is that it costs more code and sounds
like hacking around something.

It seems we (including me) have been trying to shy away from the
iput() problem. Since we are unlikely to get rid of the already
existing iput() calls from the flusher context, why not face the
problem, sort it out and use iput() with confidence in new code?

Let me try it now. The only way iput() can deadlock the flusher is for
the iput() path to come back, queue some work and wait for it.

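To make the pattern concrete, here is a minimal userspace sketch of the "flusher deadlocks itself" case. It is a toy model only: every name in it (toy_flusher, toy_queue_and_wait, toy_flusher_iput) is hypothetical and none of it is kernel API. It assumes a single flusher context and reduces "queue work and wait for it" to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not kernel code: one flusher, one work queue. */
struct toy_flusher {
	int pending;      /* work items queued but not yet executed */
	bool in_flusher;  /* true while code runs in the flusher's own context */
};

enum toy_result { TOY_DONE, TOY_WOULD_DEADLOCK };

/*
 * Models a "queue work, then wait for it" path. The wait can only
 * complete if some context other than the caller runs the queued item.
 */
enum toy_result toy_queue_and_wait(struct toy_flusher *f)
{
	assert(f != NULL);
	if (f->in_flusher)
		return TOY_WOULD_DEADLOCK; /* the flusher would wait on itself */
	f->pending++;  /* queue the work */
	f->pending--;  /* the flusher, a different context, executes it */
	return TOY_DONE;
}

/*
 * Models the flusher dropping an inode reference. 'requeues' says
 * whether the final-iput path comes back to a queue+wait helper.
 */
enum toy_result toy_flusher_iput(struct toy_flusher *f, bool requeues)
{
	enum toy_result r = TOY_DONE;

	assert(f != NULL);
	f->in_flusher = true;
	if (requeues)
		r = toy_queue_and_wait(f);
	f->in_flusher = false;
	return r;
}
```

In this model an iput() from the flusher is harmless unless its path reaches a queue+wait helper, which is why it is enough to audit the callers of those helpers.
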
Here is the exhaustive list of the queue+wait paths:

writeback_inodes_sb_nr_if_idle
  ext4_nonda_switch
    ext4_page_mkwrite                   # from page fault
    ext4_da_write_begin                 # from user writes

writeback_inodes_sb_nr
    quotactl syscall                    # from syscall
    __sync_filesystem                   # from sync/umount
    shrink_liability                    # ubifs
      make_free_space
        ubifs_budget_space              # from all over ubifs:

   2    274  /c/linux/fs/ubifs/dir.c <<ubifs_create>>
   3    531  /c/linux/fs/ubifs/dir.c <<ubifs_link>>
   4    586  /c/linux/fs/ubifs/dir.c <<ubifs_unlink>>
   5    675  /c/linux/fs/ubifs/dir.c <<ubifs_rmdir>>
   6    731  /c/linux/fs/ubifs/dir.c <<ubifs_mkdir>>
   7    803  /c/linux/fs/ubifs/dir.c <<ubifs_mknod>>
   8    871  /c/linux/fs/ubifs/dir.c <<ubifs_symlink>>
   9   1006  /c/linux/fs/ubifs/dir.c <<ubifs_rename>>
  10   1009  /c/linux/fs/ubifs/dir.c <<ubifs_rename>>
  11    246  /c/linux/fs/ubifs/file.c <<write_begin_slow>>
  12    388  /c/linux/fs/ubifs/file.c <<allocate_budget>>
  13   1125  /c/linux/fs/ubifs/file.c <<do_truncation>>   <===== deadlockable
  14   1217  /c/linux/fs/ubifs/file.c <<do_setattr>>
  15   1381  /c/linux/fs/ubifs/file.c <<update_mctime>>
  16   1486  /c/linux/fs/ubifs/file.c <<ubifs_vm_page_mkwrite>>
  17    110  /c/linux/fs/ubifs/ioctl.c <<setflags>>
  19    122  /c/linux/fs/ubifs/xattr.c <<create_xattr>>
  20    201  /c/linux/fs/ubifs/xattr.c <<change_xattr>>
  21    494  /c/linux/fs/ubifs/xattr.c <<remove_xattr>>

It seems they are all safe except for ubifs. ubifs may actually
deadlock from the above do_truncation() caller. However, it should be
fixable: the ubifs call to writeback_inodes_sb_nr() looks like a very
brute-force writeback-and-wait, and there may well be a better way out.

CCing ubifs developers for possible thoughts.

Thanks,
Fengguang

PS. I'll be traveling next week and won't have much time to reply to
emails. Sorry about that.