> On Dec 5, 2016, at 1:45 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Mon, Dec 05, 2016 at 07:51:45AM -0800, Cyril Peponnet wrote:
>> I had the issue again but I don’t have more output in dmesg or
>> journalctl even with the echo 11 > /proc/sys/fs/xfs/error_level
>> set.
>
> Which means your kernel does not have this commit:
>
> commit 847f9f6875fb02b576035e3dc31f5e647b7617a7
> Author: Eric Sandeen <sandeen@xxxxxxxxxx>
> Date: Mon Oct 12 16:04:45 2015 +1100
>
> xfs: more info from kmem deadlocks and high-level error msgs
>
> In an effort to get more useful out of "possible memory
> allocation deadlock" messages, print the size of the
> requested allocation, and dump the stack if the xfs error
> level is tuned high.
>
> The stack dump is implemented in define_xfs_printk_level()
> for error levels >= LOGLEVEL_ERR, partly because it
> seems generically useful, and also because kmem.c has
> no knowledge of xfs error level tunables or other such bits,
> it's very kmem-specific.
>
> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
> Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

Indeed we should plan an upgrade window.

>
>> Is there another location where I should look at ?
>
> Nope, there's nothing in your kernel we can use to identify the
> source of memory allocations. I'm pretty sure that RH have used
> systemtap scripts to pull this information from these kernels for
> RHEL customers - we've added additional debug help here to avoid
> that need, but your kernel doesn't have that code....
>
> Essentially, best guess is that it's file fragmentation causing
> problems with extent list allocation. Finding out why that one
> snapshot is fragmenting so much and mitigating it is probably the
> only thing you can do right now (i.e. extent size hints). Long term
> is to get gluster to do the mitigation for VM images automatically.
>

Looks like it’s better since I disabled the VM that was taking a lot
of disk space:

qemu-img info disk0.snapshot.qcow2
image: disk0.snapshot.qcow2
file format: qcow2
virtual size: 265G (284541583360 bytes)
disk size: 798G
cluster_size: 65536
backing file: base.qcow2

Note the virtual size vs. the disk size; it looks pretty fragmented.
I will follow up with the Gluster guys.

One dumb question: can the extent size hint be set at the root level?
That way all new files would get the extent size hint by inheritance.
Maybe that’s overkill, or it simply will not work. Just wanted to
know :)

Anyway, thanks, Dave, for all the detailed answers you provided.
They were really helpful.

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
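
To put a number on the virtual size vs. disk size gap noted above, here
is a minimal sketch that parses the qemu-img info output quoted in this
mail. The helper name parse_sizes and the unit table are made up for
illustration, and the parsing assumes the output format shown here
(other qemu-img versions may print sizes differently). The ratio only
shows how much more space the snapshot occupies than its virtual size;
it is not a direct measure of fragmentation, which would need an extent
count from a tool such as xfs_bmap.

```python
import re

# Output quoted earlier in this mail, pasted verbatim.
QEMU_IMG_INFO = """\
image: disk0.snapshot.qcow2
file format: qcow2
virtual size: 265G (284541583360 bytes)
disk size: 798G
cluster_size: 65536
backing file: base.qcow2
"""

# Assumption: qemu-img's K/M/G/T suffixes are powers of 1024.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_sizes(info_text):
    """Return (virtual_bytes, disk_bytes) parsed from `qemu-img info` text."""
    virtual = re.search(r"^virtual size:.*\((\d+) bytes\)", info_text, re.M)
    disk = re.search(r"^disk size:\s*([\d.]+)\s*([KMGT])", info_text, re.M)
    if virtual is None or disk is None:
        raise ValueError("unexpected qemu-img info format")
    virtual_bytes = int(virtual.group(1))
    # The disk size is only printed rounded, so this is approximate.
    disk_bytes = int(float(disk.group(1)) * UNITS[disk.group(2)])
    return virtual_bytes, disk_bytes


virtual_bytes, disk_bytes = parse_sizes(QEMU_IMG_INFO)
ratio = disk_bytes / virtual_bytes
print(f"disk/virtual = {ratio:.2f}")
```

For the snapshot above this works out to roughly three times the
virtual size on disk.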