Hi, here are the details. The issue is on the scratch RAID array (used to store KVM snapshots). The other RAID array is fine (no snapshot storage).

• kernel version (uname -a)

Linux gluster05 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

• xfsprogs version (xfs_repair -V)

xfs_repair version 3.2.0-alpha2

• number of CPUs

Two Intel® Xeon® CPU E5-2630 v3 @ 2.40GHz (8 cores each)

• contents of /proc/meminfo

MemTotal:       65699268 kB
MemFree:         2058304 kB
MemAvailable:   62753028 kB
Buffers:              12 kB
Cached:         57664044 kB
SwapCached:        14840 kB
Active:         26757700 kB
Inactive:       31967204 kB
Active(anon):     502064 kB
Inactive(anon):   719452 kB
Active(file):   26255636 kB
Inactive(file): 31247752 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4194300 kB
SwapFree:        3679388 kB
Dirty:              6804 kB
Writeback:           120 kB
AnonPages:       1048576 kB
Mapped:            55104 kB
Shmem:            160888 kB
Slab:            3999548 kB
SReclaimable:    3529220 kB
SUnreclaim:       470328 kB
KernelStack:        4240 kB
PageTables:         9464 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    37043932 kB
Committed_AS:    2200816 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      422024 kB
VmallocChunk:   34324311040 kB
HardwareCorrupted:     0 kB
AnonHugePages:    899072 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      111936 kB
DirectMap2M:     7118848 kB
DirectMap1G:    61865984 kB

• contents of /proc/mounts

rootfs / rootfs rw 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=32842984k,nr_inodes=8210746,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/system-root_vol / xfs rw,relatime,attr2,inode64,noquota 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=36,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
sunrpc /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sdc1 /boot ext4 rw,relatime,data=ordered 0 0
/dev/sda /export/raid/data xfs rw,noatime,nodiratime,attr2,nobarrier,inode64,logbsize=128k,sunit=256,swidth=1536,noquota 0 0
/dev/sdb /export/raid/scratch xfs rw,noatime,nodiratime,attr2,nobarrier,inode64,logbsize=128k,sunit=256,swidth=1024,noquota 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0

• contents of /proc/partitions

major minor  #blocks  name

   8        0 11717836800 sda
   8       16 7811891200 sdb
   8       32  500107608 sdc
   8       33     512000 sdc1
   8       34    4194304 sdc2
   8       35  495399936 sdc3
 253        0  495386624 dm-0

• RAID layout (hardware and/or software)

Hardware:

VD LIST :
=======

----------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name
----------------------------------------------------------------
0/0   RAID0 Optl  RW     Yes     RAWBC -   ON  7.275 TB scratch
----------------------------------------------------------------

Cac=CacheCade|Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|dgrd=Degraded
Optl=Optimal|RO=Read Only|RW=Read Write|HD=Hidden|B=Blocked|Consist=Consistent|
R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack|
AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled Check Consistency

Physical Drives = 4

PD LIST :
=======

-----------------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model                  Sp
-----------------------------------------------------------------------------
252:0     8 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
252:1    10 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
252:2    11 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
252:3     9 Onln   0 1.818 TB SATA HDD N   N  512B WDC WD2000FYYZ-01UL1B2 U
-----------------------------------------------------------------------------

• xfs_info output on the filesystem in question

xfs_info /export/raid/scratch/
meta-data=/dev/sdb               isize=256    agcount=32, agsize=61030368 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0
data     =                       bsize=4096   blocks=1952971776, imaxpct=5
         =                       sunit=32     swidth=128 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=32 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
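(Cross-checking the geometry, assuming I have the unit conversion right: xfs_info reports sunit/swidth in 4096-byte filesystem blocks while the mount options use 512-byte sectors, so sunit = 32 blks × 8 = 256 sectors = 128 KiB per stripe unit, and swidth = 128 blks × 8 = 1024 sectors = 512 KiB, i.e. 4 stripe units, which matches both the /dev/sdb line in /proc/mounts above and the 4-disk RAID0.)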
• dmesg output showing all error messages and stack traces

Nothing relevant in dmesg except several occurrences of the following:

[7649583.386283] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[7649585.370830] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[7649587.241290] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
[7649589.243881] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
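If it helps, the next time the warnings fire I can also capture how fragmented free memory is and how many extents the snapshot files have accumulated, along these lines (the snapshot path below is just a placeholder, not a real file on our system):

  # free pages per allocation order in each zone, to gauge memory fragmentation
  cat /proc/buddyinfo

  # rough extent count for one of the VM snapshot files
  xfs_bmap /export/raid/scratch/<some-snapshot-file> | wc -l

I'm assuming those are the numbers that matter for a kmem_alloc stall; happy to collect anything else you need.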
Thanks

> On Dec 4, 2016, at 1:49 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
> On Sat, Dec 03, 2016 at 11:08:58AM -0800, Cyril Peponnet wrote:
>> Hi xfs community :),
>> 
>> We have a glusterfs setup running under centos7.2. We are using
>> xfs as underlying storage for the bricks. The volume is used to
>> store vm snapshots that’s gets created dynamically from the
>> hypervisors through glusterfs mount points.
>> 
>> While this is working fine we have some issues from time to time
>> that make the mount points hang.
>> 
>> We have the famous XFS: possible memory allocation deadlock in
>> kmem_alloc (mode:0x250) errors appearing from time to time on some
>> of our gluster nodes.
> 
> And the complete output in dmesg is?
> 
>> Let me know if you need more information about our systems (like
>> kernel version, underlying storage…)
> 
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> 
> -Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx