On Fri, Aug 29, 2014 at 12:31 PM, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote: > On Fri, 29 Aug 2014, Trond Myklebust wrote: > >> On Fri, Aug 29, 2014 at 11:34 AM, Benjamin Coddington >> <bcodding@xxxxxxxxxx> wrote: >>> >>> >>> On Thu, 21 Aug 2014, Trond Myklebust wrote: >>>> >>>> >>>> On Thu, Aug 21, 2014 at 2:15 PM, David Jeffery <djeffery@xxxxxxxxxx> >>>> wrote: >>>>> >>>>> >>>>> On 08/21/2014 11:50 AM, Trond Myklebust wrote: >>>>>> >>>>>> >>>>>> On Thu, Aug 21, 2014 at 11:40 AM, David Jeffery <djeffery@xxxxxxxxxx> >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> On 08/20/2014 08:28 PM, Trond Myklebust wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> What guarantees that this does not lead to silent corruption of the >>>>>>>> file >>>>>>>> if there are outstanding write requests? >>>>>>>> >>>>>>> >>>>>>> Do you have a particular scenario in mind you are concerned about? >>>>>>> >>>>>>> Right before the code the patch modifies, nfs_sync_mapping() is >>>>>>> called. >>>>>>> Any writes started before the unlock operation began have already >>>>>>> been >>>>>>> flushed, so we shouldn't have a corruption case of writes from before >>>>>>> the unlock began being sent after the unlock is complete. >>>>>>> >>>>>>> Are you concerned about some other nfs4 writes being started while we >>>>>>> initially waited on the counter? Such a write racing with the unlock >>>>>> >>>>>> >>>>>> >>>>>> No. I'm worried about the writes that have been started, but which are >>>>>> now completing in the background while the lock is being freed. >>>>>> >>>>>>> going ahead instead of erroring out could initially fail from a wrong >>>>>>> state ID, but should retry with the new state. Is there something >>>>>>> I've >>>>>>> overlooked? >>>>>> >>>>>> >>>>>> >>>>>> Loss of lock atomicity due to the fact that the writes are completing >>>>>> after the lock was released. >>>>>> >>>>> >>>>> I don't think my patches break lock atomicity, unless I've completely >>>>> misunderstood nfs_sync_mapping()'s behavior. >>>>> >>>>> The background writes should have already been waited for by >>>>> nfs_sync_mapping() at the beginning of do_unlk(). nfs_sync_mapping() >>>>> would call nfs_wb_all() which calls sync_inode() with WB_SYNC_ALL, so >>>>> it's going to push out any dirty data and wait for any writes to >>>>> complete. Only after the writes are complete do we go on to call >>>>> nfs_iocounter_wait(). >>>> >>>> >>>> >>>> What if nfs_wb_all() is interrupted? Currently, that doesn't matter >>>> precisely because of the call to nfs_iocounter_wait. >>> >>> >>> >>> Hi Trond, it looks like the intent of: >>> >>> 577b423 NFS: Add functionality to allow waiting on all outstanding >>> reads.. >>> 7a8203d NFS: Ensure that NFS file unlock waits for readahead to >>> complete.. >>> >>> was to wait for readahead, not to avoid releasing the lock if >>> nfs_wb_all() >>> is interrupted. What was the underlying reason for waiting for readahead >>> to >>> complete? I know there was one issue where the NFS server would release >>> the >>> lockowner before processing the reads - was that what these were trying >>> to >>> avoid? I ask because if we're only worried about outstanding writes we >>> could just wait on the writes, not the reads. >>> >>> Would it be better to use TASK_UNINTERRUPTIBLE instead of TASK_KILLABLE >>> in >>> __nfs_iocounter_wait? >>> >>> Ben >> >> >> Hi Ben, >> >> The main intent of the two patches was to ensure that all I/O that >> uses the lock stateid should have completed before we attempt to >> release the lock. The main reason is to avoid an unnecessary >> "recovery" situation where the server asks us to resend the I/O >> because the lock stateid is no longer valid. >> >> Concerning writes; if there are any outstanding then we must ensure >> those complete before we unlock the file. This rule should be >> unaffected by the above 2 patches (although it is just a bonus if they >> help to enforce it). > > > If so, then David's patch shouldn't risk corruption any more than before > 577b423 and 7a8203d were applied. > > He's trying to avoid the problem (I think) where very aggressive readahead > behavior blocks the unlocking thread and a sysadmin decides to kill the > process, which results in the abandoned lock. That can make a big mess. > Instead, his change would result in the "recovery" scenario at worst when > the impatient sysadmin goes on a rampage. > The patch is trying to force the lock to be cleared before the I/O is done. That's wrong, whether or not the code was broken before. I'm not applying. Cheers Trond -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@xxxxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html