On 21.04.2017 11:18, Shan Hai wrote:
>
>> I also noticed that unmounting the file system takes a really long
>> time after the problem occurred (up to around 5 minutes!). Even when
>> there was nothing at all going on before the unmount. Would it help
>> to capture the unmount with trace-cmd?
>
> A huge number of negative dcache entries would cause a slow umount;
> please check it with 'cat /proc/sys/fs/dentry-state', the first column
> is for the used (active) entries while the second is for the unused
> (negative) entries.
>

Does this qualify as "huge"?

# cat /proc/sys/fs/dentry-state
10689551	10539266	45	0	0	0

> It can be dropped by 'echo 2 > /proc/sys/vm/drop_caches'.
>

Won't that drop all dentries and inodes from the cache, which is what I
should be trying to avoid?

>> Here is another theory. Could it be that not the rsyncs, but the rm
>> calls issued by rsnapshot are causing the problem? Would it help to
>> serialize all "rm -Rf" calls? Those always delete the oldest backups,
>> which can't possibly be in the cache, and because of that all those
>> inodes need to be read into memory during deletion. Maybe those rm
>> calls are filling up the XFS log?

Answering myself here: after removing the "rm" calls from rsnapshot the
problem completely disappeared. I have now written a separate script
which processes all "rm -Rf" calls sequentially.

I would deduce that the parallel deletion of big directory trees was
pushing the filesystem (and xfsaild) hard enough to cause those 10-15
minute stalls.

Thanks to everyone who contributed to solving the problem.

Have a nice weekend,
Michael
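
P.S. For anyone interested, the serialization script is nothing fancy.
Below is a minimal sketch of the idea, not the exact script I use; the
queue directory and lock file locations are made up for illustration:

#!/bin/sh
# serial-rm.sh -- drain a queue of expired backup trees one at a time,
# so that concurrent rsnapshot jobs never run "rm -Rf" in parallel.
#
# Instead of deleting directly, each rsnapshot job renames its expired
# backup into $QUEUE (a rename within the same filesystem is cheap),
# and this script then deletes the queued trees sequentially.

QUEUE=/backup/.delete-queue     # hypothetical queue directory
LOCK=/var/run/serial-rm.lock    # hypothetical lock file

mkdir -p "$QUEUE"

# Make sure only one instance of the deletion loop ever runs.
exec 9>"$LOCK"
flock -n 9 || exit 0

for dir in "$QUEUE"/*; do
        [ -d "$dir" ] || continue
        rm -Rf -- "$dir"        # one tree at a time
done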