Hi all,

For the past couple of days, my storage server has kept reporting the following messages:

XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)

I am not sure where the issue comes from, as there is still some free memory left. The only way to make the messages disappear (temporarily) is to run:

echo 2 > /proc/sys/vm/drop_caches

Please find below the details about the machine:

[root@storage02 ~]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)

[root@storage02 ~]# uname -r
3.10.0-514.26.2.el7.x86_64

[root@storage02 ~]# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0 893.8G  0 disk
sdb               8:16   0  58.2T  0 disk
sdc               8:32   0  58.2T  0 disk
sdd               8:48   0 893.8G  0 disk /beegfs/j4-meta1
sde               8:64   0  58.2T  0 disk /beegfs/j4-stor1
sdf               8:80   0  58.2T  0 disk /beegfs/j4-stor2
sdg               8:96   0 893.8G  0 disk /beegfs/j2-meta1
sdh               8:112  0  58.2T  0 disk /beegfs/j2-stor1
sdi               8:128  0  58.2T  0 disk /beegfs/j2-stor2
sdj               8:144  0 893.8G  0 disk
sdk               8:160  0  58.2T  0 disk
sdl               8:176  0  58.2T  0 disk
sdm               8:192  0 110.8G  0 disk
├─sdm1            8:193  0   256M  0 part /boot
├─sdm2            8:194  0  11.1G  0 part [SWAP]
└─sdm3            8:195  0  99.5G  0 part
  └─system-root 253:0    0  99.5G  0 lvm  /

[root@storage02 ~]# uname -r
3.10.0-514.26.2.el7.x86_64

[root@storage02 ~]# rpm -qa | grep kernel
kernel-devel-3.10.0-514.26.2.el7.x86_64
kernel-tools-libs-3.10.0-514.26.2.el7.x86_64
kernel-3.10.0-514.26.2.el7.x86_64
kernel-3.10.0-327.36.3.el7.x86_64
kernel-tools-3.10.0-514.26.2.el7.x86_64
kmod-ifs-kernel-updates-3.10.0_514.26.2.el7.x86_64-535.x86_64
ifs-kernel-updates-devel-3.10.0_514.26.2.el7.x86_64-535.x86_64
kernel-devel-3.10.0-327.36.3.el7.x86_64
kernel-headers-3.10.0-514.26.2.el7.x86_64

[root@storage02 ~]# rpm -qa | grep xfs
xfsprogs-4.5.0-10.el7_3.x86_64

[root@storage02 ~]# cat /proc/sys/vm/dirty_background_ratio
1
[root@storage02 ~]# cat /proc/sys/vm/dirty_ratio
75
[root@storage02 ~]# cat /proc/sys/vm/vfs_cache_pressure
50

rc.local:

#BeeGFS tuning - storage targets
for i in sdb sdc sde sdf sdh sdi sdk sdl; do
    echo deadline > /sys/block/$i/queue/scheduler
    echo 4096 > /sys/block/$i/queue/nr_requests
    echo 4096 > /sys/block/$i/queue/read_ahead_kb
done

#BeeGFS tuning - meta targets
for i in sda sdd sdg sdj; do
    echo deadline > /sys/block/$i/queue/scheduler
    echo 128 > /sys/block/$i/queue/nr_requests
done

echo always > /sys/kernel/mm/transparent_hugepage/enabled
echo always > /sys/kernel/mm/transparent_hugepage/defrag

The error:

XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:1(21677) possible memory allocation deadlock size 33568 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33056 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33072 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33072 in kmem_alloc (mode:0x250)
XFS: kworker/u16:2(4526) possible memory allocation deadlock size 33584 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33072 in kmem_alloc (mode:0x250)
XFS: kworker/u16:0(27932) possible memory allocation deadlock size 33072 in kmem_alloc (mode:0x250)

Temporary fix:

echo 2 > /proc/sys/vm/drop_caches

Info about the filesystems using XFS:

[root@storage02 ~]# xfs_info /dev/sde
meta-data=/dev/sde               isize=512    agcount=59, agsize=268435328 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=15625879552, imaxpct=1
         =                       sunit=128    swidth=2048 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@storage02 ~]# xfs_info /dev/sdf
meta-data=/dev/sdf               isize=512    agcount=59, agsize=268435328 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=15625879552, imaxpct=1
         =                       sunit=128    swidth=2048 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@storage02 ~]# xfs_info /dev/sdi
meta-data=/dev/sdi               isize=512    agcount=59, agsize=268435392 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=15625879552, imaxpct=1
         =                       sunit=64     swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@storage02 ~]# xfs_info /dev/sdh
meta-data=/dev/sdh               isize=512    agcount=59, agsize=268435392 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=15625879552, imaxpct=1
         =                       sunit=64     swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@storage02 ~]# mount | grep beegfs
/dev/sdd on /beegfs/j4-meta1 type ext4 (rw,noatime,nodiratime,nobarrier,data=ordered)
/dev/sdg on /beegfs/j2-meta1 type ext4 (rw,noatime,nodiratime,nobarrier,data=ordered)
/dev/sde on /beegfs/j4-stor1 type xfs (rw,noatime,nodiratime,swalloc,attr2,largeio,nobarrier,inode64,logbufs=8,logbsize=256k,sunit=1024,swidth=16384,usrquota,gqnoenforce)
/dev/sdf on /beegfs/j4-stor2 type xfs (rw,noatime,nodiratime,swalloc,attr2,largeio,nobarrier,inode64,logbufs=8,logbsize=256k,sunit=1024,swidth=16384,usrquota,gqnoenforce)
/dev/sdi on /beegfs/j2-stor2 type xfs (rw,noatime,nodiratime,swalloc,attr2,largeio,nobarrier,inode64,logbufs=8,logbsize=256k,sunit=512,swidth=8192,usrquota,gqnoenforce)
/dev/sdh on /beegfs/j2-stor1 type xfs (rw,noatime,nodiratime,swalloc,attr2,largeio,nobarrier,inode64,logbufs=8,logbsize=256k,sunit=512,swidth=8192,usrquota,gqnoenforce)

While the issue was present, I tried to gather some data:

[root@storage02 ~]# dmesg
[18038038.420617] XFS: kworker/u16:4(14255) possible memory allocation deadlock size 33712 in kmem_alloc (mode:0x250)
[18038039.884236] XFS: kworker/u16:4(14255) possible memory allocation deadlock size 33712 in kmem_alloc (mode:0x250)
[18038041.894279] XFS: kworker/u16:4(14255) possible memory allocation deadlock size 33712 in kmem_alloc (mode:0x250)

[root@storage02 ~]# free -mh
              total        used        free      shared  buff/cache   available
Mem:            62G        4.3G        979M         19M         57G         56G
Swap:           11G        211M         10G

[root@storage02 ~]# cat /proc/buddyinfo
Node 0, zone      DMA      1      0      0      1      1      0      0      0      0      1      2
Node 0, zone    DMA32   2311   4049  12054   1094      3      0      0      0      0      0      0
Node 0, zone   Normal 100457  23750   5087   1342      1      0      0      0      0      0      0

[root@storage02 ~]# ps aux | grep " D "
root 14255 0.2 0.0 0 0 ? D 10:22 0:01 [kworker/u16:4]
root 15729 0.0 0.0 0 0 ? D 10:28 0:00 [kworker/2:42]
root 15732 0.0 0.0 0 0 ? D 10:28 0:00 [kworker/2:43]
root 15734 0.0 0.0 0 0 ? D 10:28 0:00 [kworker/2:44]
root 15735 0.0 0.0 0 0 ? D 10:28 0:00 [kworker/2:45]
root 16508 0.0 0.0 112648 968 pts/1 S+ 10:31 0:00 grep --color=auto D

[root@storage02 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
58 1 216804 1000156 5714032 54494696 0 0 1228 3505 0 0 2 4 93 1 0

[root@storage02 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 1 216804 986720 5714132 54506996 0 0 1228 3505 0 0 2 4 93 1 0

[root@storage02 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
63 1 216804 1016120 5714176 54478188 0 0 1228 3505 0 0 2 4 93 1 0

[root@storage02 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 1 216804 1012572 5714196 54482276 0 0 1228 3505 0 0 2 4 93 1 0

[root@storage02 ~]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 216804 884428 5714688 54606884 0 0 1228 3505 0 0 2 4 93 1 0
0 0 216804 869488 5714744 54621520 0 0 4 2248 90864 280824 7 16 76 0 0
0 0 216804 854532 5714828 54636236 0 0 4 7188 68939 215948 6 13 81 1 0
0 0 216804 839072 5714900 54651260 0 0 12 66796 121773 390372 10 23 67 0 0
0 0 216804 824832 5714948 54665660 0 0 0 0 65629 182502 4 11 85 0 0
1 0 216804 806356 5715004 54679580 0 0 8 10296 108415 337787 9 20 71 0 0
0 0 216804 792212 5715088 54694268 0 0 0 2340 129698 400597 10 24 67 0 0
0 0 216804 777452 5715144 54708688 0 0 0 3252 92331 282641 8 16 76 0 0
39 0 216804 763512 5715700 54722608 0 0 540 5408 84641 265557 7 15 78 0 0
0 0 216804 749560 5715748 54737136 0 0 0 0 117451 385523 10 23 67 0 0
0 0 216804 734860 5715792 54751596 0 0 0 0 125409 391832 9 24 67 0 0
0 0 216804 720656 5715852 54766136 0 0 0 2476 78634 234247 6 14 80 0 0
0 0 216804 706948 5715920 54779868 0 0 24 10756 85675 270733 7 16 77 0 0
0 0 216804 693056 5715956 54793888 0 0 4 6844 122175 389915 10 23 67 0 0
41 0 216804 678724 5716012 54808844 0 0 0 0 100977 312065 8 18 74 0 0
0 0 216804 664360 5716088 54822964 0 0 0 0 79725 250138 6 14 79 0 0
0 0 216804 649820 5716144 54837296 0 0 0 2288 114461 378085 11 22 68 0 0
42 0 216804 635924 5716204 54851240 0 0 0 10652 97495 298287 7 18 75 0 0
0 0 216804 621596 5716240 54865476 0 0 8 12432 86449 261807 7 15 78 0 0

[Mon Apr 23 10:41:11 2018] XFS: kworker/u16:4(14255) possible memory allocation deadlock size 33200 in kmem_alloc (mode:0x250)
[Mon Apr 23 10:41:13 2018] XFS: kworker/u16:4(14255) possible memory allocation deadlock size 33200 in kmem_alloc (mode:0x250)

[root@storage02 ~]# date
Mon Apr 23 10:35:17 UTC 2018

[root@storage02 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 1 216804 1010596 5727768 54464024 0 0 1228 3505 0 0 2 4 93 1 0

[root@storage02 ~]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 1 216804 990480 5727912 54483820 0 0 1228 3505 0 0 2 4 93 1 0
1 1 216804 1018588 5727944 54456272 0 0 0 4 65812 181488 5 10 75 11 0
0 1 216804 1010200 5728004 54464252 0 0 0 2276 95495 323022 8 20 63 9 0
0 1 216804 1002040 5728076 54472092 0 0 0 20 126184 395655 10 23 59 8 0
0 1 216804 993956 5728164 54480232 0 0 0 7500 83897 236594 6 13 71 10 0
0 1 216804 985940 5728212 54488112 0 0 0 0 81325 268329 7 16 67 10 0
0 1 216804 1001508 5728248 54472076 0 0 0 8 125449 395975 10 23 59 8 0
1 1 216804 993664 5728324 54479908 0 0 0 2344 113745 346333 9 20 62 9 0
0 1 216804 985376 5728384 54488012 0 0 0 0 56207 172372 5 10 75 11 0
0 1 216804 1014060 5728456 54460392 0 0 0 3216 121451 388402 10 22 59 9 0
1 1 216804 1005640 5728508 54468004 0 0 24 23228 97042 281593 7 17 67 9 0
0 1 216804 996884 5728560 54476676 0 0 0 0 70147 222651 6 13 71 10 0
20 1 216804 988500 5728632 54484516 0 0 36 5872 129464 406727 10 24 58 8 0
36 1 216804 1016252 5728688 54457816 0 0 4 276 112221 350346 9 21 62 9 0
0 1 216804 1008208 5728752 54465104 0 0 0 3076 75175 222589 5 12 72 10 0
1 0 216804 1000872 5728804 54473236 0 0 16 203248 120988 376914 10 22 60 8 0
3 1 216804 1672876 5728852 53799244 0 0 4 4 135322 404492 10 25 57 8 0
0 1 216804 8276484 5728892 47196952 0 0 28 2320 15461 20558 1 11 76 12 0
0 1 216804 8278968 5728912 47197548 0 0 0 0 3449 4218 0 0 87 13 0
0 1 216804 8278316 5728940 47198152 0 0 0 7220 4725 4765 0 0 87 13 0
0 1 216804 8277780 5728960 47198788 0 0 0 4112 4895 5154 0 0 87 12 0
0 1 216804 8276972 5728972 47199444 0 0 0 4560 4198 5250 0 1 87 12 0
0 1 216804 8277036 5729000 47199360 0 0 96 6868 4027 4939 0 0 87 12 0
0 1 216804 8276372 5729016 47199876 0 0 0 4 3148 3896 0 0 87 12 0
1 1 216804 8276004 5729028 47200336 0 0 0 0 3020 3803 0 0 87 12 0
0 1 216804 8276344 5729036 47199936 0 0 52 7916 2695 3601 0 0 87 12 0
0 1 216804 8276200 5729040 47200076 0 0 0 0 1381 1782 0 0 87 12 0
0 1 216804 8276084 5729064 47200136 0 0 0 1136 6262 6581 0 0 87 12 0
0 1 216804 8276076 5729068 47200140 0 0 0 0 999 1366 0 0 87 13 0
0 1 216804 8276076 5729068 47200140 0 0 0 0 687 900 0 0 87 12 0
0 1 216804 8276076 5729068 47200140 0 0 0 0 440 660 0 0 88 13 0
0 1 216804 8276108 5729068 47200108 0 0 0 0 426 644 0 0 87 12 0
0 1 216804 8276136 5729068 47200076 0 0 0 0 1303 1256 0 0 87 12 0
0 1 216804 8276132 5729076 47200076 0 0 0 192 516 702 0 0 87 12 0
0 1 216804 8276164 5729076 47200044 0 0 0 2764 1065 919 0 0 88 12 0
0 1 216804 8276164 5729076 47200044 0 0 0 0 491 683 0 0 87 12 0
^C

[root@storage02 ~]# ps aux | grep " D "
root 14255 0.5 0.0 0 0 ? D 10:22 0:04 [kworker/u16:4]
root 15649 0.0 0.0 0 0 ? D 10:28 0:00 [kworker/3:1]
root 15686 0.0 0.0 0 0 ? D 10:28 0:00 [kworker/3:3]
root 15717 0.0 0.0 0 0 ? D 10:28 0:00 [kworker/3:5]
root 16627 0.0 0.0 0 0 ? D 10:32 0:00 [kworker/3:9]
root 17614 0.0 0.0 112648 964 pts/1 S+ 10:36 0:00 grep --color=auto D

[root@storage02 ~]# cat /proc/buddyinfo
Node 0, zone      DMA      1      0      0      1      1      0      0      0      0      1      2
Node 0, zone    DMA32   2964   4379  13405    954      4      0      0      0      0      0      0
Node 0, zone   Normal 490525 563348  71048  11427     49      0      0      0      0      0      0

[root@storage02 ~]# free -mh
              total        used        free      shared  buff/cache   available
Mem:            62G        4.3G        7.9G         19M         50G         56G
Swap:           11G        211M         10G

Does anybody have an idea what might be wrong here? I would suspect a kernel bug. We have not been able to try another version yet.

Regards,
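
For anyone who wants to compare the two /proc/buddyinfo snapshots above against the failing allocation sizes, here is a minimal Python sketch. It rests on two assumptions that are not stated in the post: that the page size is 4 KiB, and that an allocation of about 33 KB is served as a single physically contiguous power-of-two block of pages. The helper names (order_for_size, parse_buddyinfo, free_blocks_at_or_above) are made up for illustration.

```python
import math

PAGE_SIZE = 4096  # assumption: 4 KiB base pages (the x86_64 default)

# The two /proc/buddyinfo snapshots from the post, copied verbatim.
BEFORE = """\
Node 0, zone DMA 1 0 0 1 1 0 0 0 0 1 2
Node 0, zone DMA32 2311 4049 12054 1094 3 0 0 0 0 0 0
Node 0, zone Normal 100457 23750 5087 1342 1 0 0 0 0 0 0
"""
AFTER = """\
Node 0, zone DMA 1 0 0 1 1 0 0 0 0 1 2
Node 0, zone DMA32 2964 4379 13405 954 4 0 0 0 0 0 0
Node 0, zone Normal 490525 563348 71048 11427 49 0 0 0 0 0 0
"""


def order_for_size(size_bytes, page_size=PAGE_SIZE):
    """Smallest order n such that page_size * 2**n >= size_bytes."""
    pages = max(1, math.ceil(size_bytes / page_size))
    return (pages - 1).bit_length()


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order, starting at order 0]}."""
    zones = {}
    for line in text.strip().splitlines():
        parts = line.replace(",", "").split()
        # Expected shape: Node <n> zone <name> <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zones[(int(parts[1]), parts[3])] = [int(c) for c in parts[4:]]
    return zones


def free_blocks_at_or_above(counts, order):
    """Number of free blocks large enough to satisfy a request of this order."""
    return sum(counts[order:])


if __name__ == "__main__":
    order = order_for_size(33568)  # largest size seen in the first log excerpt
    print("order needed:", order)
    for label, text in (("before drop_caches", BEFORE), ("after drop_caches", AFTER)):
        for (node, zone), counts in parse_buddyinfo(text).items():
            print(label, "node", node, "zone", zone,
                  "free blocks >= order", order, ":",
                  free_blocks_at_or_above(counts, order))
```

Under those assumptions a 33568-byte request needs an order-4 block (16 pages, 64 KiB). In zone Normal the snapshots show 1 free block of order 4 or higher before drop_caches and 49 afterwards. That would fit free memory being fragmented rather than exhausted, but it is only a reading of the pasted numbers, not a confirmed diagnosis.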