Re: Another Bug with cachefs during underlaying nfs server timeout

On Tuesday 04 May 2010 19:11:29 Romain DEGEZ wrote:
> On Tuesday 04 May 2010 19:03:59 David Howells wrote:
> > Romain DEGEZ <romain.degez@xxxxxxxxxxxx> wrote:
> > > [1365947.770498] page:ffffea000cb91cf8 flags:0200000000001008 count:0 mapcount:0 mapping:(null) index:7052
> 
> Hi David,
> 
> > I *think* that means that PG_private_2 (PG_fscache) is still set on the
> >  page (the '1' three places from the end of the flags being bit 12).
> >
> > Now this looks very odd...  Ext4 doesn't use fscache yet (as far as I
> >  know), so this bit shouldn't be set in Ext4 pages.  The pages can't have
> >  had this bit set when they were allocated, because
> >  get_page_from_freelist() checks for that.
> >
> > I'm not sure what's going on here.  PG_private_2 should only be set on
> >  pages that NFS hands to fscache to read into.  At the point they're set,
> >  NFS should still own them:-/
> >
> > Are you using anything other than NFS that could be doing caching?  AFS
> > or 9PFS for example?
> 
> Nothing like that: we are using NFSv3 with fscache and an ext4 backend.
>  Nothing special.
> 
> Here is the output of my /proc/mounts:
> 
> /dev/md3 /var/cache/fscache ext4 rw,noatime,nodiratime,user_xattr,barrier=0,journal_async_commit,nobh,data=writeback 0 0
> storage-prod:/data/ondemand /data/ondemand nfs ro,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.10.30.209,mountvers=3,mountport=35795,mountproto=tcp,fsc,addr=10.10.30.209 0 0
> 
> ( /var/cache/fscache is the root cache dir configured in
>  /etc/cachefilesd.conf )
> 
> Regards,
> 
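
For anyone wanting to confirm the same setup on their own machine, here is a minimal sketch that lists the NFS mounts carrying the fsc option in /proc/mounts-style text. The function name and the sample string are illustrative only; the sample abbreviates the option lists quoted above.

```python
def nfs_mounts_with_fsc(mounts_text: str) -> list[str]:
    """Return the mount points of NFS mounts whose option list includes 'fsc'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type in ("nfs", "nfs4") and "fsc" in options.split(","):
            result.append(mount_point)
    return result


# Abbreviated sample in /proc/mounts format (device, mount point, type, options, dump, pass).
sample = (
    "/dev/md3 /var/cache/fscache ext4 rw,noatime,data=writeback 0 0\n"
    "storage-prod:/data/ondemand /data/ondemand nfs ro,vers=3,fsc,addr=10.10.30.209 0 0\n"
)
print(nfs_mounts_with_fsc(sample))
```

On a live system the text could be read from /proc/mounts instead of the sample string.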

I have been seeing a lot of these backtraces for quite some time on all our
servers (before last week's crash, and also now after the reboot):

[370490.479780] [kslowd] preemptive burial: OBJ7a41 [OBJECT_RECYCLING] ffff880428e9c540
[370517.586679] BUG: Bad page state in process lighttpd  pfn:186deb
[370517.586712] page:ffffea0005580b68 flags:0200000000001000 count:0 mapcount:0 mapping:(null) index:4367
[370517.586760] Pid: 23754, comm: lighttpd Tainted: G    B      2.6.33-2-amd64 #1
[370517.586804] Call Trace:
[370517.586832]  [<ffffffff810b1e09>] ? bad_page+0x116/0x129
[370517.586861]  [<ffffffff810b3d8a>] ? get_page_from_freelist+0x4da/0x732
[370517.586891]  [<ffffffff810b43fb>] ? __alloc_pages_nodemask+0x10f/0x5e0
[370517.586922]  [<ffffffff8110e03b>] ? mpage_bio_submit+0x22/0x26
[370517.586950]  [<ffffffff8110eaaf>] ? mpage_readpage+0x68/0x72
[370517.586978]  [<ffffffff810e1c63>] ? __kmalloc+0x12f/0x141
[370517.587009]  [<ffffffffa03b9b65>] ? cachefiles_read_or_alloc_pages+0x3b4/0x7e5 [cachefiles]
[370517.587066]  [<ffffffffa038dc33>] ? nfs_readpage_from_fscache_complete+0x0/0x5c [nfs]
[370517.587114]  [<ffffffffa034070a>] ? __fscache_read_or_alloc_pages+0x1e4/0x260 [fscache]
[370517.587174]  [<ffffffffa038db69>] ? __nfs_readpages_from_fscache+0x77/0x141 [nfs]
[370517.587236]  [<ffffffffa0375f8d>] ? nfs_readpages+0xf4/0x18d [nfs]
[370517.594039]  [<ffffffff810b5d19>] ? __do_page_cache_readahead+0x11b/0x1b4
[370517.594070]  [<ffffffff810b5dce>] ? ra_submit+0x1c/0x20
[370517.594100]  [<ffffffff811057c2>] ? __generic_file_splice_read+0xfa/0x3ee
[370517.594131]  [<ffffffff812770d8>] ? tcp_current_mss+0x3f/0x5a
[370517.594161]  [<ffffffff81233175>] ? release_sock+0x5b/0x96
[370517.594190]  [<ffffffff8126f8a4>] ? tcp_sendpage+0x44b/0x45d
[370517.594218]  [<ffffffff81103ab0>] ? pipe_to_sendpage+0x0/0x74
[370517.594246]  [<ffffffff8122ec69>] ? kernel_sendpage+0x16/0x1f
[370517.594274]  [<ffffffff8122eca7>] ? sock_sendpage+0x35/0x39
[370517.594306]  [<ffffffff81104182>] ? spd_release_page+0x0/0xe
[370517.594342]  [<ffffffff81105af0>] ? generic_file_splice_read+0x3a/0x62
[370517.594371]  [<ffffffff811043d3>] ? splice_direct_to_actor+0xbe/0x188
[370517.594401]  [<ffffffff81104a1c>] ? direct_splice_actor+0x0/0x1e
[370517.594440]  [<ffffffff81113274>] ? ep_scan_ready_list+0x132/0x151
[370517.594468]  [<ffffffff811044e7>] ? do_splice_direct+0x4a/0x64
[370517.594498]  [<ffffffff810e8fa8>] ? do_sendfile+0x12d/0x1a8
[370517.594526]  [<ffffffff810e906c>] ? sys_sendfile64+0x49/0x88
[370517.594556]  [<ffffffff8103145f>] ? sysenter_dispatch+0x7/0x2e
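
As a quick cross-check of the flags words in the two "page:" lines of this thread, a small sketch like the following lists which bit positions are set (bit 0 being the least significant). The helper name set_bits is made up for illustration. It deliberately reports bit indices only: which PG_* name each index maps to depends on the kernel version and configuration, so the identification of bit 12 as PG_private_2 (PG_fscache) remains David's reading, not something the script establishes.

```python
def set_bits(flags_hex: str) -> list[int]:
    """Return the indices of the bits set in a hex flags word (bit 0 = LSB)."""
    value = int(flags_hex, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Flags words copied from the two "page:" lines quoted in this thread.
for flags in ("0200000000001008", "0200000000001000"):
    print(flags, set_bits(flags))
```

Both words have bit 12 set, which matches the '1' three places from the end of the hex string.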


It doesn't seem to be harmful for our service (a rare HTTP GET may
occasionally fail in lighttpd, but that's not a big deal), but ... it's not
very clean in the kernel log :-)

-- 
RD

--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs