drop_caches does not help, so it's a permanent leak. Yeah, we're looking into updating our tree to a newer one.

Shantanu

> On Friday, July 25, 2014 11:17 AM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
>
> The question is: does it become un-reclaimable period (it's still full
> when you force the kernel to dump caches)... or is it unreclaimable only
> by other processes allocating via GFP_NOFS?
>
> Also, you should be able to port the fscache changes onto your tree
> by looking at David's commit history.
>
> - Milosz
>
> On Fri, Jul 25, 2014 at 11:15 AM, Shantanu Goel <sgoel01@xxxxxxxxx> wrote:
>> Our codebase is a bit older than 3.15.5, so the only patch I was able to
>> integrate immediately and deploy is the one which papers over the
>> wait_on_page_write deadlock. We are still evaluating the others. The
>> underlying problem of a page reference leak remains, which I'm still
>> trying to chase down. It literally gets to the point where the entire
>> file page cache becomes unreclaimable.
>>
>> Thanks,
>> Shantanu
>>
>>> On Friday, July 25, 2014 11:04 AM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
>>>
>>> Did you have a chance to try any of my patches?
>>>
>>> On Tue, Jul 22, 2014 at 10:11 AM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
>>>> Shantanu,
>>>>
>>>> You are correct in that there is a correlation between those
>>>> messages and fscache hangs. The correlation is that there are a number
>>>> of bugs in the fscache code when things fail (like when allocation
>>>> fails). I have 3 different patches that I'm going to submit later on
>>>> for review for these issues against 3.15.5. I'll post them later today
>>>> when I get a free minute.
>>>>
>>>> Also, sorry for not CCing the list in the first place.
>>>>
>>>> On Tue, Jul 22, 2014 at 8:56 AM, Shantanu Goel <sgoel01@xxxxxxxxx> wrote:
>>>>> Hi Milosz,
>>>>>
>>>>> We thought they were harmless too initially; however, we have seen
>>>>> quite a few hangs at this point on hosts where we enabled fscache +
>>>>> cachefiles and think there is some correlation with these messages.
>>>>> It could be some generic VM issue which is exacerbated by the use of
>>>>> fscache. We posted our initial message to get some ideas about where
>>>>> the problem might be, since we are not certain of where the problem
>>>>> lies.
>>>>>
>>>>> Thanks,
>>>>> Shantanu
>>>>>
>>>>>> On Monday, July 21, 2014 9:31 PM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
>>>>>>
>>>>>> Shantanu,
>>>>>>
>>>>>> This message is harmless. I see this a lot as well, as we have
>>>>>> fscache on SSD disks that do 1.1GB/s transfer, a demanding
>>>>>> application, and a lot of pages waiting for write out. The kernel
>>>>>> cannot force a write out because fscache has to allocate pages with
>>>>>> the GFP_NOFS flag (to prevent a recursive hang).
>>>>>>
>>>>>> On our production machines we changed the vm.min_free_kbytes kernel
>>>>>> tunable to 256MB (the machine has a lot of RAM) and this happens a
>>>>>> lot less often.
>>>>>>
>>>>>> - Milosz
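For reference, the two knobs discussed above and at the top of the thread (vm.min_free_kbytes and drop_caches) are plain procfs interfaces. A minimal sketch, in Python, assuming root and not part of the original thread, of raising min_free_kbytes to 256 MB and forcing a cache drop:

#!/usr/bin/env python
# Sketch only: raise vm.min_free_kbytes and ask the kernel to drop clean
# caches. Both paths are standard kernel procfs interfaces; run as root.

MIN_FREE_KBYTES = 256 * 1024   # 256 MB expressed in kB

def write_proc(path, value):
    # Write a sysctl-style value followed by a newline.
    with open(path, "w") as f:
        f.write("%s\n" % value)

if __name__ == "__main__":
    # Equivalent to: sysctl -w vm.min_free_kbytes=262144
    write_proc("/proc/sys/vm/min_free_kbytes", MIN_FREE_KBYTES)
    # Equivalent to: echo 3 > /proc/sys/vm/drop_caches
    # (1 = page cache, 2 = dentries and inodes, 3 = both). Only clean,
    # unreferenced pages can be discarded, which is why a page reference
    # leak survives a drop_caches run.
    write_proc("/proc/sys/vm/drop_caches", 3)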
>>>>>> On Wed, Jul 9, 2014 at 12:35 PM, Shantanu Goel <sgoel01@xxxxxxxxx> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> We are running Linux 3.10 + fscache-20130702 + commit
>>>>>>> bae6b235905ab9dc6659395f7802c1d36fb63f15 from dhowells' git tree and
>>>>>>> suspect there might be a page reference leak somewhere as evidenced
>>>>>>> by the low reclaim ratio. We also see the kernel complaining about
>>>>>>> memory allocation failures in the readahead code as seen below.
>>>>>>>
>>>>>>> The machine has plenty of filecache and there isn't any heavy write
>>>>>>> activity which could also pin the pages.
>>>>>>>
>>>>>>> We wrote a script to monitor /proc/vmstat and see the following
>>>>>>> output with a one second interval.
>>>>>>>
>>>>>>>  Refill    Scan   Free  FrRatio  Slab  Runs  Stalls
>>>>>>>       0  154663   6855        4  1024     2       0
>>>>>>>       0    7152   7152      100     0     4       0
>>>>>>>       0    5960   5960      100     0     4       0
>>>>>>>       0    7152   7152      100  1024     3       0
>>>>>>>       0  152407  21698       14  1024     2       0
>>>>>>>
>>>>>>> The fields are:
>>>>>>>
>>>>>>> Refill  - # active pages scanned, i.e. deactivated
>>>>>>> Scan    - # inactive pages scanned
>>>>>>> Free    - # pages freed
>>>>>>> FrRatio - free / scan ratio as percentage
>>>>>>> Runs    - # times kswapd ran
>>>>>>> Stalls  - # times direct reclaim ran
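The monitoring script itself is not included in the thread; the sketch below is one way such a monitor could look, assuming the 3.10-era /proc/vmstat counter names (pgrefill_*, pgscan_*, pgsteal_*, slabs_scanned, pageoutrun, allocstall) and summing the per-zone variants by prefix. Column meanings follow the field list above.

#!/usr/bin/env python
# Sketch of a /proc/vmstat delta monitor producing columns like the table
# above. Not the original script; counter names assume a 3.10-era kernel.
import time

def read_vmstat():
    # Return /proc/vmstat as a dict of counter name -> integer value.
    stats = {}
    with open("/proc/vmstat") as f:
        for line in f:
            name, value = line.split()
            stats[name] = int(value)
    return stats

def total(stats, prefix):
    # Sum all counters sharing a prefix (e.g. per-zone pgscan_* variants).
    return sum(v for k, v in stats.items() if k.startswith(prefix))

def sample(prev, cur):
    refill = total(cur, "pgrefill") - total(prev, "pgrefill")
    scan   = total(cur, "pgscan")   - total(prev, "pgscan")
    free   = total(cur, "pgsteal")  - total(prev, "pgsteal")
    slab   = total(cur, "slabs_scanned") - total(prev, "slabs_scanned")
    runs   = total(cur, "pageoutrun") - total(prev, "pageoutrun")
    stalls = total(cur, "allocstall") - total(prev, "allocstall")
    ratio  = free * 100 // scan if scan else 0
    return refill, scan, free, ratio, slab, runs, stalls

if __name__ == "__main__":
    print("%8s %8s %8s %8s %8s %6s %7s" %
          ("Refill", "Scan", "Free", "FrRatio", "Slab", "Runs", "Stalls"))
    prev = read_vmstat()
    while True:
        time.sleep(1)                 # one-second interval, as in the thread
        cur = read_vmstat()
        print("%8d %8d %8d %8d %8d %6d %7d" % sample(prev, cur))
        prev = cur

Rows where Scan is large but FrRatio stays small are the low reclaim ratio referred to above.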
>>>>>>> The free memory looks like this (free -m):
>>>>>>>
>>>>>>>              total       used       free     shared    buffers     cached
>>>>>>> Mem:         24175      23986        188          0         16      10453
>>>>>>> -/+ buffers/cache:      13517      10657
>>>>>>> Swap:         8191         85       8106
>>>>>>>
>>>>>>> Are you aware of any such issue with fscache / cachefiles? If not,
>>>>>>> could you suggest what other information we could gather to debug it
>>>>>>> further?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Shantanu
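On the question of what else could be gathered: one cheap signal (a sketch, not something that appears in the thread) is to watch the standard /proc/vmstat gauges for file pages that ought to be reclaimable, i.e. nr_file_pages minus the pages that are legitimately busy (nr_dirty, nr_writeback, nr_mapped). If that remainder stays large while the reclaim ratio above stays low, clean, unmapped page cache is being held by something, which is consistent with a page reference leak.

#!/usr/bin/env python
# Sketch: estimate how much of the page cache is clean and unmapped, and
# therefore should be reclaimable. The gauge names are standard /proc/vmstat
# fields; the "reclaimable" figure is only a rough estimate (it still
# includes shmem and swap cache pages).
import time

GAUGES = ("nr_file_pages", "nr_dirty", "nr_writeback", "nr_mapped")

def read_gauges():
    vals = dict.fromkeys(GAUGES, 0)
    with open("/proc/vmstat") as f:
        for line in f:
            name, value = line.split()
            if name in vals:
                vals[name] = int(value)
    return vals

if __name__ == "__main__":
    print("%12s %10s %10s %10s %12s" %
          ("file_pages", "dirty", "writeback", "mapped", "reclaimable"))
    while True:
        g = read_gauges()
        busy = g["nr_dirty"] + g["nr_writeback"] + g["nr_mapped"]
        print("%12d %10d %10d %10d %12d" %
              (g["nr_file_pages"], g["nr_dirty"], g["nr_writeback"],
               g["nr_mapped"], g["nr_file_pages"] - busy))
        time.sleep(5)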
>>>>>>> SysRq : HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e)
>>>>>>> memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k)
>>>>>>> show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n)
>>>>>>> poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s)
>>>>>>> show-task-states(t) unmount(u) show-blocked-tasks(w) dump-ftrace-buffer(z)
>>>>>>> python: page allocation failure: order:0, mode:0x11110
>>>>>>> CPU: 5 PID: 18997 Comm: python Tainted: G W O 3.10.36-el5.ia32e.lime.0 #1
>>>>>>> Hardware name: Supermicro X8STi/X8STi, BIOS 2.0 06/03/2010
>>>>>>>  0000000000000000 ffff880104d39578 ffffffff81426c43 ffff880104d39608
>>>>>>>  ffffffff810fdcfc ffff88061fffd468 0000004000000000 ffff880104d395a8
>>>>>>>  ffffffff810fd506 0000000000000000 ffffffff8109120d 0001111000000000
>>>>>>> Call Trace:
>>>>>>>  [<ffffffff81426c43>] dump_stack+0x19/0x1e
>>>>>>>  [<ffffffff810fdcfc>] warn_alloc_failed+0xfc/0x140
>>>>>>>  [<ffffffff810fd506>] ? drain_local_pages+0x16/0x20
>>>>>>>  [<ffffffff8109120d>] ? on_each_cpu_mask+0x4d/0x70
>>>>>>>  [<ffffffff810fef1c>] __alloc_pages_nodemask+0x6cc/0x910
>>>>>>>  [<ffffffff81139a9a>] alloc_pages_current+0xba/0x160
>>>>>>>  [<ffffffff810f6617>] __page_cache_alloc+0xa7/0xc0
>>>>>>>  [<ffffffffa050a4ea>] cachefiles_read_backing_file+0x2ba/0x7b0 [cachefiles]
>>>>>>>  [<ffffffffa050ac87>] cachefiles_read_or_alloc_pages+0x2a7/0x3d0 [cachefiles]
>>>>>>>  [<ffffffff81062c4f>] ? wake_up_bit+0x2f/0x40
>>>>>>>  [<ffffffffa04da24a>] ? fscache_run_op+0x5a/0xa0 [fscache]
>>>>>>>  [<ffffffffa04daacb>] ? fscache_submit_op+0x1db/0x4c0 [fscache]
>>>>>>>  [<ffffffffa04dbd65>] __fscache_read_or_alloc_pages+0x1f5/0x2c0 [fscache]
>>>>>>>  [<ffffffffa0531e9e>] __nfs_readpages_from_fscache+0x7e/0x1b0 [nfs]
>>>>>>>  [<ffffffffa052bc6a>] nfs_readpages+0xca/0x1f0 [nfs]
>>>>>>>  [<ffffffff81139a9a>] ? alloc_pages_current+0xba/0x160
>>>>>>>  [<ffffffff81101ae2>] __do_page_cache_readahead+0x1b2/0x260
>>>>>>>  [<ffffffff81101bb1>] ra_submit+0x21/0x30
>>>>>>>  [<ffffffff81101e85>] ondemand_readahead+0x115/0x240
>>>>>>>  [<ffffffff81102038>] page_cache_async_readahead+0x88/0xb0
>>>>>>>  [<ffffffff810f5f5e>] ? find_get_page+0x1e/0xa0
>>>>>>>  [<ffffffff810f81fc>] generic_file_aio_read+0x4dc/0x720
>>>>>>>  [<ffffffffa05224b9>] nfs_file_read+0x89/0x100 [nfs]
>>>>>>>  [<ffffffff81158a4f>] do_sync_read+0x7f/0xb0
>>>>>>>  [<ffffffff8115a3a5>] vfs_read+0xc5/0x190
>>>>>>>  [<ffffffff8115a57f>] SyS_read+0x5f/0xa0
>>>>>>>  [<ffffffff814327bb>] tracesys+0xdd/0xe2
>>>>>>> Mem-Info:
>>>>>>> Node 0 DMA32 per-cpu:
>>>>>>> CPU 0: hi: 186, btch: 31 usd:   2
>>>>>>> CPU 1: hi: 186, btch: 31 usd: 159
>>>>>>> CPU 2: hi: 186, btch: 31 usd:  82
>>>>>>> CPU 3: hi: 186, btch: 31 usd: 182
>>>>>>> CPU 4: hi: 186, btch: 31 usd:  57
>>>>>>> CPU 5: hi: 186, btch: 31 usd:   0
>>>>>>> Node 0 Normal per-cpu:
>>>>>>> CPU 0: hi: 186, btch: 31 usd:  22
>>>>>>> CPU 1: hi: 186, btch: 31 usd: 117
>>>>>>> CPU 2: hi: 186, btch: 31 usd:  16
>>>>>>> CPU 3: hi: 186, btch: 31 usd:  79
>>>>>>> CPU 4: hi: 186, btch: 31 usd:  59
>>>>>>> CPU 5: hi: 186, btch: 31 usd:   0
>>>>>>> active_anon:3461555 inactive_anon:451550 isolated_anon:0
>>>>>>>  active_file:236104 inactive_file:1899298 isolated_file:0
>>>>>>>  unevictable:467 dirty:109 writeback:0 unstable:0
>>>>>>>  free:33860 slab_reclaimable:19654 slab_unreclaimable:5110
>>>>>>>  mapped:15284 shmem:2 pagetables:10231 bounce:0
>>>>>>>  free_cma:0
>>>>>>> Node 0 DMA32 free:91764kB min:4824kB low:6028kB high:7236kB
>>>>>>> active_anon:2185528kB inactive_anon:566976kB active_file:108808kB
>>>>>>> inactive_file:670996kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
>>>>>>> present:3660960kB managed:3646160kB mlocked:0kB dirty:152kB writeback:0kB
>>>>>>> mapped:1344kB shmem:0kB slab_reclaimable:17300kB slab_unreclaimable:580kB
>>>>>>> kernel_stack:80kB pagetables:6448kB unstable:0kB bounce:0kB free_cma:0kB
>>>>>>> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>>>>>>> lowmem_reserve[]: 0 20607 20607
>>>>>>> Node 0 Normal free:39164kB min:27940kB low:34924kB high:41908kB
>>>>>>> active_anon:11663332kB inactive_anon:1239224kB active_file:835608kB
>>>>>>> inactive_file:6926636kB unevictable:1868kB isolated(anon):0kB isolated(file):0kB
>>>>>>> present:21495808kB managed:21101668kB mlocked:1868kB dirty:612kB writeback:0kB
>>>>>>> mapped:60120kB shmem:8kB slab_reclaimable:61536kB slab_unreclaimable:19860kB
>>>>>>> kernel_stack:2088kB pagetables:34804kB unstable:0kB bounce:0kB free_cma:0kB
>>>>>>> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>>>>>>> lowmem_reserve[]: 0 0 0
>>>>>>> Node 0 DMA32: 506*4kB (UEM) 545*8kB (UEM) 569*16kB (UEM) 897*32kB (UEM)
>>>>>>> 541*64kB (UEM) 87*128kB (UM) 4*256kB (UM) 2*512kB (M) 0*1024kB 0*2048kB
>>>>>>> 0*4096kB = 92000kB
>>>>>>> Node 0 Normal: 703*4kB (UE) 285*8kB (UEM) 176*16kB (UEM) 836*32kB (UEM)
>>>>>>> 7*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 35108kB
>>>>>>> Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
>>>>>>> 2137331 total pagecache pages
>>>>>>> 1215 pages in swap cache
>>>>>>> Swap cache stats: add 7056879, delete 7055664, find 633567/703482
>>>>>>> Free swap  = 8301172kB
>>>>>>> Total swap = 8388604kB
>>>>>>> 6291455 pages RAM
>>>>>>> 102580 pages reserved
>>>>>>> 4391935 pages shared
>>>>>>> 4746779 pages non-shared
>
>
> --
> Milosz Tanski
> CTO
> 16 East 34th Street, 15th floor
> New York, NY 10016
>
> p: 646-253-9055
> e: milosz@xxxxxxxxx

--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs