Re: RFC: fixing kernel oops on interrupted COMMIT from nfs_commit_file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 14, 2017 at 5:17 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> On Fri, Apr 14, 2017 at 3:16 PM, Trond Myklebust
> <trondmy@xxxxxxxxxxxxxxx> wrote:
>> On Fri, 2017-04-14 at 11:56 -0400, Olga Kornievskaia wrote:
>>> On Thu, Apr 13, 2017 at 4:50 PM, Trond Myklebust
>>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>>> >
>>> > Secondly, if you want to release the request and you are not sure
>>> > whether or not it got cleared off the inode's cinfo commit list
>>> > yet,
>>> > you may also need to lock that request and call
>>> > nfs_clear_request_commit().
>>>
>>> Looking at what the function does, I don't see why this is needed.
>>> "wb_page" is NULL for this type of commit and there is no pnfs in
>>> this
>>> case.
>>
>> Something needs to ensure that the request is not sitting on a commit
>> list. That can happen if the commit succeeded, but returned a different
>> verifier, or it can happen if nfs_scan_commit() exits early.
>
> First point I'd like you to consider is that your last statement as a
> separate problem that needs to be solved with a different patch.
>
> Secondly, I don't understand how to determine "if nfs_commit_file()
> isn't sure whether or not request got cleared off the inode's cinfo
> commit list". There are two cases you pointed to 1) verifier mismatch
> and 2) nfs_scan_commit didn't even get to the request (caz INT_MAX
> commits were queued before that)?
>
> For the verifier mismatch: "NFS_CONTEXT_RESEND_WRITES" is set. So
> nfs_commit_file() can check for that. But I guess I'm not sure what is
> suppose to be done? Resend the commit or resend the copy?
>
> I don't really understand how to handle #2. nfs_commit_inode() return
> from doing INT_MAX commits but not doing the commit that it was
> suppose to do nfs_commit_file() asked and I'm at a loss how to
> determine that. Shouldn't there have been a loop somewhere that
> handles *all* commit requests and not just INT_MAX requests. Is this
> problem not a problem from the nfs_commit_file() but rather a generic
> problem?

Taking a different approach to fixing it by getting rid of
nfs_commit_file() and for the synchronous COPY appending COMMIT to the
compound. For the asynchronous COPY, I propose to just add a function
nfs4_proc_copy() that uses existing callback functions and just sends
the commit without messing with the commit path.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux