On Sun, Dec 04, 2016 at 05:47:04PM -0800, Cyril Peponnet wrote:
> > On Dec 4, 2016, at 5:22 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > What deadlock is that? XFS is reporting memory allocation
> > issues, not that there is a filesystem deadlock. Your comments
> > that dropping caches make the problem go away indicate that
> > there isn't any deadlock, just blocking on memory allocation
> > that is taking a long time to resolve…
>
> You are right by deadlock I meant that mount point hang for ls
> commands for instance (from the server itself) and basically the
> glusterfs mount on the hypervisors is also hanging for all vms (it
> makes the vms to remount RO).

So everything is /blocked/, not deadlocked. If the memory allocation
then makes progress, we're all ok.

> On this server I have 1200 living snapshots, usually not too much
> write in it except for some of them…
>
> Is there a way to optimize the memory allocation? Or we likely
> need to add more ram…

More RAM won't help - it's likely a memory fragmentation problem
made worse by the fact that older kernels won't do memory compaction
(i.e. memory defrag) when high order memory allocation fails.

If the problem is file fragmentation, then the best you can probably
do right now is limit fragmentation. Stopping qcow2 from fragmenting
the crap out of the files will prevent the problem from re-occurring.
This is the primary use case for extent size hints in XFS these days,
though I know you can't actually control the gluster back end to use
this. Perhaps there are other tweaks to qcow formats (cluster size?)
or gluster config to limit the amount of fragmentation...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
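
For anyone wanting a rough look at whether free memory is fragmented in the sense described above, one option is to read /proc/buddyinfo, which lists the number of free blocks per allocation order for each zone. The following is only a minimal sketch: the function names are made up for illustration, and the order threshold of 4 is an arbitrary assumption, since the thread does not say which allocation order is failing.

```python
import os


def parse_buddyinfo(text):
    """Parse /proc/buddyinfo text into {(node, zone): [free blocks per order]}.

    Index i of each list is the number of free blocks of 2**i pages.
    """
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal 12 7 3 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = parts[1].rstrip(",")
        zone = parts[3]
        zones[(node, zone)] = [int(n) for n in parts[4:]]
    return zones


def free_blocks_at_or_above(zones, min_order):
    """Count free blocks of order >= min_order across all zones."""
    return sum(sum(counts[min_order:]) for counts in zones.values())


if __name__ == "__main__":
    path = "/proc/buddyinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as f:
            zones = parse_buddyinfo(f.read())
        for (node, zone), counts in sorted(zones.items()):
            print(f"node {node} zone {zone}: {counts}")
        # Threshold of 4 is an illustrative assumption, not from the thread.
        print("free blocks of order >= 4:", free_blocks_at_or_above(zones, 4))
```

Plenty of free memory in the low-order columns with few or no free blocks in the high-order columns would be consistent with the fragmentation described above, though it does not by itself prove that this is the cause of the stalls.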