Hi,

We've run into a somewhat unexpected condition. Under high memory pressure and high I/O write pressure on slow media, when doing network calls, we have a call chain that looks like:

    .. -> tcp_recvmsg -> .. -> do_page_fault -> ..
       -> __alloc_pages_slowpath -> try_to_free_pages -> ..
       -> shrink_slab -> super_cache_scan
       -> xfs_fs_free_cached_objects -> xfs_reclaim_inodes_nr
       -> xfs_reclaim_inodes_ag -> mutex_lock
       -> __mutex_lock_slowpath

And it stays stuck there. This causes the network traffic to stall, which causes applications (in this case Ceph OSDs) to fail basic health checks.

This particular call chain is due to the end of xfs_reclaim_inodes_ag, which has the code:

    if (skipped && (flags & SYNC_WAIT) && *nr_to_scan > 0) {
        trylock = 0;
        goto restart;
    }

The code first tries to reclaim using trylock on the mutexes, but if it fails to release a sufficient number of items, and there were groups that it failed to lock, it tries again with blocking locks. If another kernel thread holds the mutexes for any reason (such as currently flushing the group), we essentially make kernel memory allocation wait for disk I/O.

On this particular system, we have 30 other XFS filesystems also mounted, and there are also a lot of non-XFS caches that could be reclaimed to meet this memory request. There's about 100GB of other caches that could be released, so why block?

We've worked around this with the following probe:

    probe module("xfs").function("xfs_reclaim_inodes_ag").call {
        printf ("%s -> %s: %d %s [%s]\n", thread_indent(0), probefunc(),
                kernel_int($nr_to_scan), kernel_string($mp->m_fsname), $$parms)
        print_backtrace()
        $flags = $flags & 2
    }

In other words, remove the SYNC_WAIT flag from the call. This causes the slab shrinker to move on to the next candidate for releasing.

So far, this seems to fix all the problems we've seen. The probe could probably be improved to only do this for the call chain that reaches xfs_reclaim_inodes_ag from shrink_slab.

Is there a better way to fix this problem?

- Thorvald