Re: memory reclaim problems on fs usage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Arkadiusz Miskiewicz wrote:
> Ok. In mean time I've tried 4.3.0 kernel + patches (the same as before + one 
> more) on second server which runs even more rsnapshot processes and also uses 
> xfs on md raid 6.
> 
> Patches:
> http://sprunge.us/DfIQ (debug patch from Tetsuo)
> http://sprunge.us/LQPF (backport of things from git + one from ml)
> 
> The problem is now with high order allocations probably:
> http://ixion.pld-linux.org/~arekm/log-mm-2srv-1.txt.gz

This seems to be stall upon allocating transparent huge pages.
(page fault -> try to allocate 2MB page -> reclaim -> waiting)

----------
[ 1166.110205] Node 0 Normal: 4801*4kB (UE) 1461*8kB (UMEC) 290*16kB (UMEC) 715*32kB (UM) 67*64kB (UME) 7*128kB (UMEC) 3*256kB (MEC) 6*512kB (UMEC) 3*1024kB (MEC) 10*2048kB (UME) 219*4096kB (M) = 988012kB
[ 1178.250751] Node 0 Normal: 4917*4kB (UME) 2622*8kB (UMEC) 530*16kB (UMC) 713*32kB (UME) 68*64kB (UM) 8*128kB (UMC) 5*256kB (MEC) 5*512kB (UMC) 8*1024kB (MEC) 4*2048kB (UME) 207*4096kB (M) = 945412kB
[ 1190.108587] Node 0 Normal: 4301*4kB (UME) 3132*8kB (UMEC) 670*16kB (UMEC) 766*32kB (UME) 74*64kB (UME) 11*128kB (UMC) 8*256kB (UMEC) 4*512kB (UMEC) 3*1024kB (MEC) 2*2048kB (UM) 206*4096kB (M) = 938676kB
[ 1202.014434] Node 0 Normal: 3971*4kB (UM) 2704*8kB (UMC) 650*16kB (UMEC) 754*32kB (UE) 68*64kB (UME) 6*128kB (UMC) 3*256kB (UEC) 3*512kB (UEC) 6*1024kB (MEC) 5*2048kB (UM) 206*4096kB (M) = 939628kB
[ 1212.438969] Node 0 Normal: 5307*4kB (UE) 4976*8kB (UEC) 1095*16kB (UC) 702*32kB (UM) 67*64kB (UM) 6*128kB (UMC) 2*256kB (UC) 37*512kB (UMC) 36*1024kB (MC) 29*2048kB (UME) 324*4096kB (M) = 1548892kB
[ 1222.840549] Node 0 Normal: 0*4kB 5*8kB (UMEC) 711*16kB (UC) 490*32kB (UME) 66*64kB (U) 6*128kB (UEC) 3*256kB (UMC) 3*512kB (UEC) 2*1024kB (MC) 1*2048kB (U) 303*4096kB (M) = 1279576kB
[ 1243.941537] Node 0 Normal: 28825*4kB (UE) 26405*8kB (UMEC) 12409*16kB (UMC) 3351*32kB (UME) 408*64kB (UM) 19*128kB (UMC) 3*256kB (UC) 4*512kB (UMC) 2*1024kB (UC) 3*2048kB (UME) 65*4096kB (M) = 938108kB
[ 1256.006097] Node 0 Normal: 40219*4kB (UME) 34873*8kB (UE) 17232*16kB (UME) 5633*32kB (UME) 598*64kB (UM) 1*128kB (M) 0*256kB 2*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 935252kB
[ 1268.951714] Node 0 Normal: 36607*4kB (UME) 37875*8kB (UME) 17450*16kB (UME) 5497*32kB (UME) 485*64kB (U) 0*128kB 0*256kB 1*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 936084kB
[ 1279.263718] Node 0 Normal: 39565*4kB (UME) 39229*8kB (UME) 17553*16kB (UME) 5174*32kB (UME) 279*64kB (UM) 2*128kB (ME) 4*256kB (ME) 3*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 939180kB
[ 1300.454484] Node 0 Normal: 37945*4kB (U) 52089*8kB (U) 24197*16kB (UE) 7420*32kB (UME) 493*64kB (UM) 10*128kB (UME) 1*256kB (M) 1*512kB (E) 2*1024kB (ME) 1*2048kB (E) 39*4096kB (M) = 1390524kB
[ 1310.761624] Node 0 Normal: 18034*4kB (U) 44842*8kB (UE) 19948*16kB (UE) 5383*32kB (UME) 162*64kB (UME) 2*128kB (ME) 3*256kB (ME) 5*512kB (ME) 1*1024kB (M) 0*2048kB 0*4096kB = 937272kB
[ 1320.848763] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=4574
[ 1321.988480] Node 0 Normal: 19294*4kB (UME) 45490*8kB (UE) 20125*16kB (UME) 5229*32kB (UM) 93*64kB (UM) 4*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 936888kB
[ 1342.042799] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=10944
[ 1352.791917] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=14172
[ 1364.143762] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=17581
[ 1374.499973] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=20691
[ 1384.842838] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=23797
[ 1395.192348] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=26905
[ 1405.525243] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=30008
[ 1415.831459] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=33103
[ 1427.113385] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=36491
[ 1437.416290] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=39585
[ 1437.428716] MemAlloc: rsnapshot(3639) gfp=0x4f52ca order=9 delay=3389
(...snipped...)
[30084.882928] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=3654551
[30085.744771] Node 0 Normal: 212267*4kB (UME) 10914*8kB (UE) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 936380kB
[30084.889563] MemAlloc: syslog-ng(1671) gfp=0x4f52ca order=9 delay=324074
[30106.734724] MemAlloc: cp(2795) gfp=0x4f52ca order=9 delay=3698251
[30106.747497] MemAlloc: syslog-ng(1671) gfp=0x4f52ca order=9 delay=330638
[30117.713596] MemAlloc: cp(2795) gfp=0x4f52ca order=9 delay=3701548
[30117.719721] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=3664412
[30117.726356] MemAlloc: syslog-ng(1671) gfp=0x4f52ca order=9 delay=333935
[30139.904472] MemAlloc: cp(2795) gfp=0x4f52ca order=9 delay=3708212
[30139.910594] MemAlloc: khugepaged(32) gfp=0xcf52ca order=9 delay=3671076
[30139.917237] MemAlloc: syslog-ng(1671) gfp=0x4f52ca order=9 delay=340599
[30172.684553] MemAlloc: cp(2795) gfp=0x4f52ca order=9 delay=3718056
[30173.749694] Node 0 Normal: 234069*4kB (UE) 148*8kB (UE) 113*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 939268kB
----------

So many 4kB pages but no 2048kB pages.

> System is doing very slow progress and for example depmod run took 2 hours
> http://sprunge.us/HGbE
> Sometimes I was able to ssh-in, dmesg took 10-15 minutes but sometimes it 
> worked fast for short period.
> 
> Ideas?

Memory fragmentation is out of my understanding.
Maybe disabling THP can help.

> 
> ps. I also had one problem with low order allocation but only once and wasn't 
> able to reproduce so far. I was running kernel with backport patches but no 
> debug patch, so got only this in logs:
> http://sprunge.us/WPXi

Unfortunately 16kB pages was not available at that moment.
But it was GFP_ATOMIC and infrequent failure should not become a problem.

----------
[ 8513.740326] swapper/3: page allocation failure: order:2, mode:0xc020
(...snipped...)
[ 8514.247719] Node 0 Normal: 207920*4kB (UE) 18052*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 976096kB
----------

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]