Re: Is this nfsd kernel oops known?

> On Sep 10, 2022, at 5:14 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> 
> On Wed, Sep 07, 2022 at 08:52:46AM -0400, Benjamin Coddington wrote:
>> On 7 Sep 2022, at 0:58, Chuck Lever III wrote:
>> 
>>>> On Sep 6, 2022, at 3:12 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>>>> 
>>>> On Tue, Sep 6, 2022 at 2:28 PM Benjamin Coddington
>>>> <bcodding@xxxxxxxxxx> wrote:
>>>>> 
>>>>> On 1 Sep 2022, at 21:27, Olga Kornievskaia wrote:
>>>>> 
>>>>>> Thanks Chuck. I first, based on a hunch, narrowed down that it's
>>>>>> coming from Al Viro's merge commit. Then I git bisected his
>>>>>> 32 patches to the following commit:
>>>>>> f0f6b614f83dbae99d283b7b12ab5dd2e04df979
>>>>> 
>>>>> No crash for me after reverting
>>>>> f0f6b614f83dbae99d283b7b12ab5dd2e04df979.
>>>> 
>>>> I second that. No crash after a revert here.
>>> 
>>> I bisected the new xfstests failures to the same commit:
>>> 
>>> f0f6b614f83dbae99d283b7b12ab5dd2e04df979 is the first bad commit
>>> commit f0f6b614f83dbae99d283b7b12ab5dd2e04df979
>>> Author: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
>>> Date:   Thu Jun 23 17:21:37 2022 -0400
>>> 
>>>    copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
>>> 
>>>    ... just shove it into one pipe_buffer.
>>> 
>>>    Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
>>> 
>>> lib/iov_iter.c | 21 ++++++---------------
>>> 1 file changed, 6 insertions(+), 15 deletions(-)
>>> 
>> 
>> I've been reliably reproducing this on generic/551 on xfs.  In the case
>> where it crashes, rqstp->rq_res.page_base is a positive multiple of
>> PAGE_SIZE after getting set in nfsd_splice_actor(), and that, combined
>> with page_len, overruns the 256 pages we have.
>> 
>> With f0f6b614f83d reverted, rqstp->rq_res.page_base is always zero.
>> 
>> After 47b7fcae419dc and f0f6b614f83d, buf->offset in nfsd_splice_actor can
>> be a positive multiple of PAGE_SIZE, whereas before it was always just the
>> offset into the page.
>> 
>> Something like this might fix it up:
>> 
>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>> index 9f486b788ed0..d62963f36f03 100644
>> --- a/fs/nfsd/vfs.c
>> +++ b/fs/nfsd/vfs.c
>> @@ -849,7 +849,7 @@ nfsd_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
>> 
>>        svc_rqst_replace_page(rqstp, buf->page);
>>        if (rqstp->rq_res.page_len == 0)
>> -               rqstp->rq_res.page_base = buf->offset;
>> +               rqstp->rq_res.page_base = buf->offset % PAGE_SIZE;
>>        rqstp->rq_res.page_len += sd->len;
>>        return sd->len;
>> }
>> 
>> .. but we should check with Al about whether this needs to be fixed over in
>> copy_page_to_iter_pipe(), since I don't think a pipe_buffer offset should be
>> greater than PAGE_SIZE.
>> 
>> Al, any thoughts?
> 
> pipe_buffer offsets in general can be greater than PAGE_SIZE.  What's more,
> buffer *size* can be greater than PAGE_SIZE - it really can contain more
> than PAGE_SIZE worth of data.  In that case the page is a compound one, of
> course.
> 
> Regression is the combination of "splice from regular file to pipe hadn't
> produced that earlier, now it does" and "nfsd never needed to handle that".
> 
> I don't believe that fix is correct.  *IF* you can't deal with compound
> page in sunrpc, you need a loop going by subpages in nfsd_splice_actor(),
> similar to one that used to be in copy_page_to_iter().  Could you try
> the following:
> 
> nfsd_splice_actor(): handle compound pages
> 
> pipe_buffer might refer to a compound page (and contain more than a PAGE_SIZE
> worth of data).  Theoretically it had been possible since way back, but
> nfsd_splice_actor() hadn't run into that until copy_page_to_iter() change.
> Fortunately, the only thing that changes for compound pages is that we
> need to stuff each relevant subpage in and convert the offset into offset
> in the first subpage.
> 
> Hopefully-fixes: f0f6b614f83d "copy_page_to_iter(): don't split high-order page in case of ITER_PIPE"
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> ---

I'm still getting my head around this, so just a couple of comments so far.


> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 9f486b788ed0..b16aed158ba6 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -846,10 +846,14 @@ nfsd_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
> 		  struct splice_desc *sd)
> {
> 	struct svc_rqst *rqstp = sd->u.data;
> -
> -	svc_rqst_replace_page(rqstp, buf->page);
> -	if (rqstp->rq_res.page_len == 0)
> -		rqstp->rq_res.page_base = buf->offset;
> +	struct page *page = buf->page;	// may be a compound one
> +	unsigned offset = buf->offset;
> +
> +	page += offset / PAGE_SIZE;

Nit: I see "offset / PAGE_SIZE" is used in the iter code base,
but in the NFS stack, we prefer "offset >> PAGE_SHIFT" and
"offset & ~PAGE_MASK" (below).
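
To spell out why the two spellings are interchangeable, here is a small
userspace sketch. It is not kernel code: PAGE_SHIFT, PAGE_SIZE and PAGE_MASK
are redefined locally with an assumed 4 KiB page size, and the helper names
(subpage_index, subpage_offset, check_equivalence) are hypothetical, chosen
only for illustration.

```c
#include <assert.h>

/* Assumed for illustration: 4 KiB pages. The kernel defines these itself. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Index of the subpage holding byte 'offset' of a compound page. */
static unsigned long subpage_index(unsigned long offset)
{
	return offset >> PAGE_SHIFT;	/* same value as offset / PAGE_SIZE */
}

/* Offset of that byte within its own subpage. */
static unsigned long subpage_offset(unsigned long offset)
{
	return offset & ~PAGE_MASK;	/* same value as offset % PAGE_SIZE */
}

/* Both spellings agree for any unsigned offset, since PAGE_SIZE is 2^n. */
static void check_equivalence(unsigned long offset)
{
	assert(subpage_index(offset) == offset / PAGE_SIZE);
	assert(subpage_offset(offset) == offset % PAGE_SIZE);
}
```

The equivalence relies on the offset being unsigned and PAGE_SIZE being a
power of two, so this is purely a style preference.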


> +	for (int i = sd->len; i > 0; i -= PAGE_SIZE)
> +		svc_rqst_replace_page(rqstp, page++);
> +	if (rqstp->rq_res.page_len == 0)	// first call
> +		rqstp->rq_res.page_base = offset % PAGE_SIZE;
> 	rqstp->rq_res.page_len += sd->len;
> 	return sd->len;
> }
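
For anyone following the arithmetic, below is a userspace sketch of how a
(buf->offset, sd->len) pair maps onto subpages of a compound page. It is a
model under stated assumptions, not the patch itself: PAGE_SIZE is fixed at
4096 for illustration, and struct subpage_span and span_of() are hypothetical
names that do not exist in the kernel.

```c
#include <assert.h>

#define PAGE_SIZE 4096u		/* assumed for illustration */

/* Hypothetical description of the subpages a buffer touches. */
struct subpage_span {
	unsigned first;		/* index of the first subpage touched */
	unsigned count;		/* number of subpages touched */
	unsigned page_base;	/* offset into the first subpage */
};

/* Map 'len' bytes starting at byte 'offset' of a compound page. */
static struct subpage_span span_of(unsigned offset, unsigned len)
{
	struct subpage_span s;

	assert(len > 0);
	s.first = offset / PAGE_SIZE;
	s.page_base = offset % PAGE_SIZE;
	/* The last byte lives at offset + len - 1. */
	s.count = (offset + len - 1) / PAGE_SIZE - s.first + 1;
	return s;
}
```

With these definitions, span_of(8192, 8192) starts at subpage 2 with a
page_base of 0 and covers two subpages. span_of(4196, 4096) also covers two
subpages, because the data starts 100 bytes into subpage 1 and spills into
subpage 2, whereas counting by length alone would give one. Whether such an
unaligned buffer can actually reach nfsd_splice_actor() is not something this
sketch establishes.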

I could take this through the nfsd for-rc tree, but that's based
on 5.19-rc7 so it doesn't have f0f6b614f83d. I don't think it will
break functionality, but I'm wondering if it would be better for
you to take this through your tree to improve bisect-ability.

If you agree and Ben reports a Tested-by:, then here's my

  Acked-by: Chuck Lever <chuck.lever@xxxxxxxxxx>


It might be nice one day to have a single mechanism for NFSD
to handle READs. I don't think a pipe is going to work for our
upcoming hole-detection scheme, for example.


--
Chuck Lever






