Re: [PATCH RFC 00/30] Overhaul NFSD filecache

> On Jun 23, 2022, at 1:51 PM, Wang Yugui <wangyugui@xxxxxxxxxxxx> wrote:
> 
> Hi,
> 
>>> On Jun 23, 2022, at 5:02 AM, Wang Yugui <wangyugui@xxxxxxxxxxxx> wrote:
>>> 
>>> Hi,
>>> 
>>>>> On Jun 22, 2022, at 3:04 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>>>>>> On Jun 22, 2022, at 2:36 PM, Wang Yugui <wangyugui@xxxxxxxxxxxx> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> fstests generic/531 triggered a panic on kernel 5.19.0-rc3 with this
>>>>>> patchset.
>>>>> 
>>>>> As I mentioned in the cover letter, I haven't tried running generic/531
>>>>> yet -- no claim at all that this is finished work or that #386 has
>>>>> been fixed at this point. I'm merely interested in comments on the
>>>>> general approach.
>>>>> 
>>>>> 
>>>>>> [ 405.478056] BUG: kernel NULL pointer dereference, address: 0000000000000049
>>>>> 
>>>>> The "RIP:" line gives the location of the crash. Notice that the call
>>>>> trace here does not include that information. From your attachment:
>>>>> 
>>>>> [ 405.518022] RIP: 0010:nfsd_do_file_acquire+0x4e1/0xb80 [nfsd]
>>>>> 
>>>>> To match that to a line of source code:
>>>>> 
>>>>> [cel@manet ~]$ cd src/linux/linux/
>>>>> [cel@manet linux]$ scripts/faddr2line ../obj/manet/fs/nfsd/filecache.o nfsd_do_file_acquire+0x4e1
>>>>> nfsd_do_file_acquire+0x4e1/0xfc0:
>>>>> rht_bucket_insert at /home/cel/src/linux/linux/include/linux/rhashtable.h:303
>>>>> (inlined by) __rhashtable_insert_fast at /home/cel/src/linux/linux/include/linux/rhashtable.h:718
>>>>> (inlined by) rhashtable_lookup_get_insert_key at /home/cel/src/linux/linux/include/linux/rhashtable.h:982
>>>>> (inlined by) nfsd_file_insert at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1031
>>>>> (inlined by) nfsd_do_file_acquire at /home/cel/src/linux/linux/fs/nfsd/filecache.c:1089
>>>>> [cel@manet linux]$
>>>>> 
>>>>> This is just an example; I'm sure my compiled objects don't match yours.
>>>>> 
>>>>> And, now that I've added observability, you should be able to do:
>>>>> 
>>>>> # watch cat /proc/fs/nfsd/filecache
>>>>> 
>>>>> to see how many items are in the hash and LRU list while the test
>>>>> is running.
>>>>> 
>>>>> 
>>>>>> [ 405.608016] Call Trace:
>>>>>> [ 405.608016] <TASK>
>>>>>> [ 405.613020] nfs4_get_vfs_file+0x325/0x410 [nfsd]
>>>>>> [ 405.618018] nfsd4_process_open2+0x4ba/0x16d0 [nfsd]
>>>>>> [ 405.623016] ? inode_get_bytes+0x38/0x40
>>>>>> [ 405.623016] ? nfsd_permission+0x97/0xf0 [nfsd]
>>>>>> [ 405.628022] ? fh_verify+0x1cc/0x6f0 [nfsd]
>>>>>> [ 405.633025] nfsd4_open+0x640/0xb30 [nfsd]
>>>>>> [ 405.638025] nfsd4_proc_compound+0x3bd/0x710 [nfsd]
>>>>>> [ 405.643017] nfsd_dispatch+0x143/0x270 [nfsd]
>>>>>> [ 405.648019] svc_process_common+0x3bf/0x5b0 [sunrpc]
>>>> 
>>>> I was able to trigger something that looks very much like this crash.
>>>> If you remove this line from fs/nfsd/filecache.c:
>>>> 
>>>> 	.max_size		= 131072, /* buckets */
>>>> 
>>>> things get a lot more stable for generic/531.
>>>> 
>>>> I'm looking into the issue now.
>>> 
>>> Yes. When '.max_size = 131072' is removed, fstests generic/531 passed.
>> 
>> Great! Are you comfortable with this general approach for bug #386?
> 
> It seems like a good result for #386.
> 
> fstests generic/531 (file-max: 1M) performance results:
> base (5.19.0-rc3, 12-bit hash, serialized nfsd_file_gc): 222s
> this patchset (.min_size=4096): 59s
> So this is a good improvement for #386.
> 
> It seems like a good (acceptable) result for #387 too.
> The period of 'text busy' (exec directly from the back-end of the NFS server)
> is about 4s.

I was surprised to learn that NFSv4 doesn't close those files outright, and
that they remain in the filecache for a bit. I think we can do better
there, but I haven't looked at that closely yet.

I see that files are left open on the server by crashed NFSv4 clients too.
That will result in the server becoming unusable after significant
uptime, which I've seen on occasion (because I crash my clients a lot)
but never looked into until now. 
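
Coming back to the '.max_size' line above: that field caps how many buckets
the rhashtable is allowed to grow to. As a minimal, hypothetical sketch (the
struct and values below are illustrative stand-ins, not the actual
fs/nfsd/filecache.c code), a parameter block without the cap looks like this;
'.min_size' only sets the floor, so the bucket table can keep resizing as a
workload like generic/531 (~1M open files) adds entries:

	#include <linux/rhashtable.h>

	/* Hypothetical cache entry, standing in for struct nfsd_file. */
	struct example_entry {
		struct rhash_head	ex_rhash;	/* hash linkage */
		struct inode		*ex_inode;	/* lookup key */
	};

	/*
	 * Hypothetical parameters, standing in for the filecache's
	 * rhashtable setup. With .max_size left unset, bucket table
	 * growth is not artificially capped; .min_size (4096 in Wang's
	 * test) only sets the initial/minimum number of buckets.
	 */
	static const struct rhashtable_params example_rhash_params = {
		.key_len		= sizeof(struct inode *),
		.key_offset		= offsetof(struct example_entry, ex_inode),
		.head_offset		= offsetof(struct example_entry, ex_rhash),
		.min_size		= 4096,		/* buckets */
		.automatic_shrinking	= true,
		/* no .max_size: growth limited only by rhashtable's own maximum */
	};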


--
Chuck Lever






