Re: ceph_read_iter NULL pointer dereference

On Mon, Aug 05 2024, Xiubo Li wrote:

> Hi Luis,
>
> Thanks for reporting this. BTW, is this reproducible?
>
> This is also the first time I have seen this crash.
>
>
> The 'i_size == 0' case is easy to reproduce; please see my debug logs
> below:
>
> ++++++++++++++++++++++++++++
>
>  ceph_read_iter: 0~1024 trying to get caps on 000000006a438277 100000001f7.fffffffffffffffe
>  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe need Fr want Fc
>  __ceph_caps_issued: 000000006a438277 100000001f7.fffffffffffffffe cap 000000001a8b6d16 issued pAsLsXsFrw
>  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe have pAsLsXsFrw but not Fc (revoking -)
>  try_get_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe ret 1 got Fr
>  ceph_read_iter: sync 000000006a438277 100000001f7.fffffffffffffffe 0~1024 got cap refs on Fr
>  ceph_sync_read: on file 00000000e029b65e 0~400
>  __ceph_sync_read: on inode 000000006a438277 100000001f7.fffffffffffffffe 0~400
>  __ceph_sync_read: orig 0~1024 reading 0~1024
>  __ceph_sync_read: 0~1024 got -2 i_size 0
>  __ceph_sync_read: result 0 retry_op 0
>  ceph_read_iter: 000000006a438277 100000001f7.fffffffffffffffe dropping cap refs on Fr = 0
>  __ceph_put_cap_refs: 000000006a438277 100000001f7.fffffffffffffffe had Fr last
>  __ceph_caps_issued: 000000006a438277 100000001f7.fffffffffffffffe cap 000000001a8b6d16 issued pAsLsXsFrw
> +++++++++++++++++++++++++++++++++
>
> I just created an empty file, opened it for r/w in Client.A, and then
> opened the same file in Client.B and did a simple read.
>
> Currently the ceph kclient doesn't check 'i_size' before sending the sync
> read request to RADOS; it only checks it after getting the contents back. As
> I remember, this logic complies with the "MIX" filelock state in the MDS:
>
> [LOCK_MIX]       = { 0,         false, LOCK_MIX,  0,    0,   REQ, ANY, 0,   0,   0, CEPH_CAP_GRD|CEPH_CAP_GWR|CEPH_CAP_GLAZYIO,0,0,CEPH_CAP_GRD },
>
> You can open a ceph tracker issue for this.

I'll do that, and thanks for the analysis.  I'll need to catch up on a few
things first after being offline for a week, but I'll get back to this bug
shortly.

Cheers,
-- 
Luís


>
> Thanks
>
> - Xiubo
>
> On 8/3/24 00:39, Luis Henriques wrote:
>> Hi Xiubo,
>>
>> I was wondering if you have ever seen the BUG below.  I've debugged it a
>> bit and the issue seems to occur here, while doing the SetPageUptodate():
>>
>> 		if (ret <= 0)
>> 			left = 0;
>> 		else if (off + ret > i_size)
>> 			left = i_size - off;
>> 		else
>> 			left = ret;
>> 		while (left > 0) {
>> 			size_t plen, copied;
>>
>> 			plen = min_t(size_t, left, PAGE_SIZE - page_off);
>> 			SetPageUptodate(pages[idx]);
>> 			copied = copy_page_to_iter(pages[idx++],
>> 						   page_off, plen, to);
>> 			off += copied;
>> 			left -= copied;
>> 			page_off = 0;
>> 			if (copied < plen) {
>> 				ret = -EFAULT;
>> 				break;
>> 			}
>> 		}
>>
>> So, the issue is that we have idx > num_pages.  And I'm almost sure that's
>> because of i_size being '0' and 'left' ending up with a huge value.  But I
>> haven't managed to figure out yet why i_size is '0'.
>>
>> (Note: I'll be offline next week, but I'll continue looking into this the
>> week after.  But I figured I should report the bug anyway, in case you've
>> seen something similar.)
>>
>> Cheers,
>
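
Editorial sketch of the arithmetic discussed in the quoted report.  This is
standalone userspace C, not kernel code.  It assumes that 'off' and 'i_size'
are unsigned 64-bit values (the quoted snippet does not show their
declarations), it uses a hypothetical 4096-byte page size, and every function
name below is made up for illustration.  Under those assumptions,
'left = i_size - off' wraps around to a huge value only when 'off' is larger
than 'i_size'.  In the debug log quoted earlier the read returned -2, so the
'ret <= 0' branch gives left = 0 there; the exact values that triggered the
crash are not confirmed anywhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch only; the kernel's PAGE_SIZE varies. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Mirrors the branch structure of the quoted snippet, with off and
 * i_size assumed to be unsigned 64-bit (an assumption, see above).
 */
static uint64_t left_unclamped(int64_t ret, uint64_t off, uint64_t i_size)
{
	if (ret <= 0)
		return 0;
	if (off + (uint64_t)ret > i_size)
		return i_size - off;	/* wraps around when off > i_size */
	return (uint64_t)ret;
}

/*
 * Hypothetical guard: treat a read that starts at or past i_size as
 * having nothing to copy, so the subtraction can never wrap.
 */
static uint64_t left_clamped(int64_t ret, uint64_t off, uint64_t i_size)
{
	if (ret <= 0 || off >= i_size)
		return 0;
	if (off + (uint64_t)ret > i_size)
		return i_size - off;
	return (uint64_t)ret;
}

/*
 * Number of page slots a copy loop would index when copying 'left'
 * bytes, starting 'page_off' bytes into the first page.
 */
static uint64_t pages_touched(uint64_t left, uint64_t page_off)
{
	uint64_t first;

	assert(page_off < SKETCH_PAGE_SIZE);
	if (left == 0)
		return 0;
	first = SKETCH_PAGE_SIZE - page_off;	/* room in the first page */
	if (left <= first)
		return 1;
	return 2 + (left - first - 1) / SKETCH_PAGE_SIZE;
}
```

For example, left_unclamped(1024, 4096, 0) wraps around to UINT64_MAX - 4095,
and pages_touched() on that value is far larger than any page array a 1024
byte read would allocate, which matches the "idx > num_pages" symptom.
left_clamped(1024, 4096, 0) returns 0 instead.  The clamp is only one possible
guard and is not necessarily the fix that should go upstream.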




