Hi, here are the details. The issue is on the scratch RAID array (used to store KVM snapshots). The other RAID array is fine (no snapshot storage).

• kernel version (uname -a)

Linux gluster05 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

• xfsprogs version (xfs_repair -V)

xfs_repair version 3.2.0-alpha2

• number of CPUs

Two Intel® Xeon® CPU E5-2630 v3 @ 2.40GHz (8 cores each)

• contents of /proc/meminfo

MemTotal:       65699268 kB
MemFree:         2058304 kB
MemAvailable:   62753028 kB
Buffers:              12 kB
Cached:         57664044 kB
SwapCached:        14840 kB
Active:         26757700 kB
Inactive:       31967204 kB
Active(anon):     502064 kB
Inactive(anon):   719452 kB
Active(file):   26255636 kB
Inactive(file): 31247752 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4194300 kB
SwapFree:        3679388 kB
Dirty:              6804 kB
Writeback:           120 kB
AnonPages:       1048576 kB
Mapped:            55104 kB
Shmem:            160888 kB
Slab:            3999548 kB
SReclaimable:    3529220 kB
SUnreclaim:       470328 kB
KernelStack:        4240 kB
PageTables:         9464 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    37043932 kB
Committed_AS:    2200816 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      422024 kB
VmallocChunk:   34324311040 kB
HardwareCorrupted:     0 kB
AnonHugePages:    899072 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      111936 kB
DirectMap2M:     7118848 kB
DirectMap1G:    61865984 kB

• contents of /proc/mounts

rootfs / rootfs rw 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=32842984k,nr_inodes=8210746,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/system-root_vol / xfs rw,relatime,attr2,inode64,noquota 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=36,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
sunrpc /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sdc1 /boot ext4 rw,relatime,data=ordered 0 0
/dev/sda /export/raid/data xfs rw,noatime,nodiratime,attr2,nobarrier,inode64,logbsize=128k,sunit=256,swidth=1536,noquota 0 0
/dev/sdb /export/raid/scratch xfs rw,noatime,nodiratime,attr2,nobarrier,inode64,logbsize=128k,sunit=256,swidth=1024,noquota 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0

• contents of /proc/partitions

major minor  #blocks  name

   8        0 11717836800 sda
   8       16 7811891200 sdb
   8       32  500107608 sdc
   8       33     512000 sdc1
   8       34    4194304 sdc2
   8       35  495399936 sdc3
 253        0  495386624 dm-0

• RAID layout (hardware and/or software)

Hardware:

VD LIST :
=======

----------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name
----------------------------------------------------------------
0/0   RAID0 Optl  RW     Yes     RAWBC -   ON  7.275 TB scratch
----------------------------------------------------------------

Cac=CacheCade|Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|dgrd=Degraded
Optl=Optimal|RO=Read Only|RW=Read Write|HD=Hidden|B=Blocked|Consist=Consistent|
R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack|
AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled Check Consistency

Physical Drives = 4

PD LIST :
=======

-----------------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model                  Sp
-----------------------------------------------------------------------------
252:0     8 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
252:1    10 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
252:2    11 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
252:3     9 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
-----------------------------------------------------------------------------

• xfs_info output on the filesystem in question

xfs_info /export/raid/scratch/
meta-data=/dev/sdb               isize=256    agcount=32, agsize=61030368 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0
data     =                       bsize=4096   blocks=1952971776, imaxpct=5
         =                       sunit=32     swidth=128 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=32 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
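(Cross-checking the geometry, assuming I have the unit conversion right: xfs_info reports sunit/swidth in 4096-byte filesystem blocks while the mount options use 512-byte sectors, so sunit = 32 blks × 8 = 256 sectors = 128 KiB per stripe unit, and swidth = 128 blks × 8 = 1024 sectors = 512 KiB, i.e. 4 stripe units, which matches both the /dev/sdb line in /proc/mounts above and the 4-disk RAID0.)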
• dmesg output showing all error messages and stack traces

Nothing relevant in dmesg except several occurrences of the following:

[7649583.386283] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[7649585.370830] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[7649587.241290] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[7649589.243881] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
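If it helps, the next time the warnings fire I can also capture how fragmented free memory is and how many extents the snapshot files have accumulated, along these lines (the snapshot path below is just a placeholder, not a real file on our system):

  # free pages per allocation order in each zone, to gauge memory fragmentation
  cat /proc/buddyinfo

  # rough extent count for one of the VM snapshot files
  xfs_bmap /export/raid/scratch/<some-snapshot-file> | wc -l

I'm assuming those are the numbers that matter for a kmem_alloc stall; happy to collect anything else you need.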
Thanks

> On Dec 4, 2016, at 1:49 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
> On Sat, Dec 03, 2016 at 11:08:58AM -0800, Cyril Peponnet wrote:
>> Hi xfs community :),
>> 
>> We have a glusterfs setup running under centos7.2. We are using
>> xfs as underlying storage for the bricks. The volume is used to
>> store vm snapshots that’s gets created dynamically from the
>> hypervisors through glusterfs mount points.
>> 
>> While this is working fine we have some issues from time to time
>> that make the mount points hang.
>> 
>> We have the famous XFS: possible memory allocation deadlock in
>> kmem_alloc (mode:0x250) errors appearing from time to time on some
>> of our gluster nodes.
> 
> And the complete output in dmesg is?
> 
>> Let me know if you need more information about our systems (like
>> kernel version, underlying storage…)
> 
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> 
> -Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx