Still having the same issues here on Gentoo 2.6.33. I have been having the exact same issue since the 2.6.30 bugs were squashed. It runs fine for up to 3-4 hours, then under load it just dies.

Greg

-----Original Message-----
From: linux-cachefs-bounces@xxxxxxxxxx [mailto:linux-cachefs-bounces@xxxxxxxxxx] On Behalf Of Romain DEGEZ
Sent: Tuesday, 30 March 2010 12:34 AM
To: linux-cachefs@xxxxxxxxxx
Subject: cachefiles bug

Dear David,

First of all, thanks for your work. It looks very promising, as we have been missing such a nice functionality in the kernel for so long!

In a production setup we have 4 servers with 16 GB of RAM and dual quad-core Xeon L5410 processors, running a 2.6.33-2-amd64 Debian kernel. These servers are used to send files over HTTP (using apache or lighttpd). The files are all located on a remote NFS server and cached locally, thanks to fs-cache and cachefilesd, on a local 2-disk RAID1 array with a 250 GB ext4 filesystem mounted on /var/cache/fscache.

The NFS filesystem is mounted this way:

x.x.x.x:/data on /data type nfs (ro,noatime,tcp,soft,fsc,addr=x.x.x.x)

cachefilesd.conf is:

dir /var/cache/fscache
tag mycache
brun 10%
bcull 7%
bstop 3%
frun 10%
fcull 7%
fstop 3%

# cat /proc/fs/fscache/stats
FS-Cache statistics
Cookies: idx=3 dat=2880 spc=0
Objects: alc=2484 nal=0 avl=2484 ded=2462
ChkAux : non=0 ok=2131 upd=0 obs=70
Pages : mrk=15802814 unc=14993041
Acquire: n=2883 nul=0 noc=252 ok=2631 nbf=252 oom=0
Lookups: n=2484 neg=343 pos=2141 crt=0 tmo=343
Updates: n=0 nul=0 run=0
Relinqs: n=1721 nul=0 wcr=0 rtr=20
AttrChg: n=0 ok=0 nbf=0 oom=0 run=0
Allocs : n=0 ok=0 wt=0 nbf=0 int=0
Allocs : ops=0 owt=0 abt=0
Retrvls: n=14741 ok=5400 wt=452 nod=693 nbf=8648 int=0 oom=0
Retrvls: ops=6093 owt=112 abt=0
Stores : n=1972991 ok=1972776 agn=0 nbf=215 oom=0
Stores : ops=999 run=1965351 pgs=1964352 rxd=1972776 olm=0
VmScan : nos=14959114 gon=0 bsy=10 can=8424
Ops : pend=112 run=7092 enq=16438335 can=0 rej=0
Ops : dfr=0 rel=7092 gc=0
CacheOp: alo=0 luo=0 luc=0 gro=0
CacheOp: upo=0 dro=0 pto=0 atc=0 syn=0
CacheOp: rap=0 ras=0 alp=0 als=0 wrp=0 ucp=0 dsp=0

We are seeing a lot of these errors in dmesg on all our servers:

[ 4868.465413] CacheFiles: I/O Error: Unlink failed
[ 4868.465444] FS-Cache: Cache cachefiles stopped due to I/O error
[ 4947.320011] CacheFiles: File cache on md3 unregistering
[ 4947.320041] FS-Cache: Withdrawing cache "mycache"
[ 5127.348683] FS-Cache: Cache "mycache" added (type cachefiles)
[ 5127.348716] CacheFiles: File cache on md3 registered
[ 7076.871081] CacheFiles: I/O Error: Unlink failed
[ 7076.871130] FS-Cache: Cache cachefiles stopped due to I/O error
[ 7116.780891] CacheFiles: File cache on md3 unregistering
[ 7116.780937] FS-Cache: Withdrawing cache "mycache"
[ 7296.813394] FS-Cache: Cache "mycache" added (type cachefiles)
[ 7296.813432] CacheFiles: File cache on md3 registered

This is very painful, as it renders the cache useless.

Looking at the source code, the cause of the "I/O Error: Unlink failed" message, which seems to be raised somewhere after the "bury_something" function is called, looks pretty obscure to me. I don't see why any unlink would fail.

I have been monitoring this list for some time and have tried all the various patches without success.

Could you please give me a hand troubleshooting this issue?

Regards,
--
RD

--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs
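
For anyone who wants to compare the counters above between machines, here is a minimal Python sketch that turns the /proc/fs/fscache/stats text into a dictionary and computes the share of retrieval requests counted under nbf. The function name parse_fscache_stats is made up for illustration, and reading nbf as "request rejected for lack of buffers" is an assumption based on the usual meaning of that counter, so treat the ratio as a rough indicator only.

```python
import re


def parse_fscache_stats(text):
    """Parse 'Label : key=value ...' lines into {label: {key: int}}.

    Labels that appear on several lines (e.g. 'Retrvls') are merged
    into a single dictionary. Hypothetical helper, not part of any tool.
    """
    stats = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a label, e.g. the header line
        counters = stats.setdefault(label.strip(), {})
        for key, value in re.findall(r"(\w+)=(\d+)", rest):
            counters[key] = int(value)
    return stats


# Excerpt copied from the statistics quoted in this thread.
SAMPLE = """\
FS-Cache statistics
Retrvls: n=14741 ok=5400 wt=452 nod=693 nbf=8648 int=0 oom=0
Retrvls: ops=6093 owt=112 abt=0
Stores : n=1972991 ok=1972776 agn=0 nbf=215 oom=0
"""

stats = parse_fscache_stats(SAMPLE)
retrievals = stats["Retrvls"]
# Share of retrieval requests counted under nbf (assumed: rejected).
nbf_share = retrievals["nbf"] / retrievals["n"]
print(f"retrievals counted under nbf: {nbf_share:.1%}")
```

On a live machine the same function can be fed the file contents directly, e.g. parse_fscache_stats(open("/proc/fs/fscache/stats").read()).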