On 2011-09-09 11:20, Trond Myklebust wrote: > On Thu, 2011-09-08 at 23:11 -0400, tao.peng@xxxxxxx wrote: >> HI, Trond, >> >>> -----Original Message----- >>> From: Myklebust, Trond [mailto:Trond.Myklebust@xxxxxxxxxx] >>> Sent: Friday, September 09, 2011 1:05 AM >>> To: Peng Tao >>> Cc: Peng, Tao; gusev.vitaliy@xxxxxxxxxxx; gusev.vitaliy@xxxxxxxxx; >>> linux-nfs@xxxxxxxxxxxxxxx >>> Subject: RE: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release >>> >>>> -----Original Message----- >>>> From: Peng Tao [mailto:bergwolf@xxxxxxxxx] >>>> Sent: Thursday, September 08, 2011 11:00 AM >>>> To: Myklebust, Trond >>>> Cc: tao.peng@xxxxxxx; gusev.vitaliy@xxxxxxxxxxx; >>>> gusev.vitaliy@xxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx >>>> Subject: Re: [PATCH] nfs: fix inifinite loop at >>>> nfs4_layoutcommit_release >>>> >>>> On Thu, Sep 8, 2011 at 8:01 PM, Myklebust, Trond >>>> <Trond.Myklebust@xxxxxxxxxx> wrote: >>>>>> -----Original Message----- >>>> >>>>> >>>>> Yes, but as far as I can see, even in the blocks case there can be >>>> multiple extents per layout segment. What if I write to one >>>> uninitialised extent, layoutcommit, then write to another uninitialized >>>> extent in the same layout segment and layoutcommit? In my reading of >>>> the code, there is a chance that the second layoutcommit will fail to >>>> pick up the layout segment, and so will fail to notify the MDS that the >>>> second extent now contains data. >>>> >>>> blocklayout does not decide what to layoutcommit according to the lseg >>>> list. Instead, it keeps track of each extent's state at the >>>> granularity of blocksize, and encode whatever needs layoutcommitted in >>>> the layoutcommit call. So in your above case, as long as the second >>>> layoutcommit is issued, blocklayout will encode the newly written >>>> extent in the second layoutcommit call, even if the lseg is not >>>> attached to the second layoutcommit. >>>> >>>> But that leads to another two question: >>>> 1. How do we protect against layoutrecall if lseg is not linked to >>>> layoutcommit? For this one, can we just reject layoutrecall if there >>>> is inflight layoutcommit? It will be less parallel but can guarantee >>>> current implementation correctness. >>>> 2. blocklayout ONLY: bl_committing may be overloaded by several >>>> layoutcommit calls and we don't have information in >>>> cleanup_layoutcommit() on how many entry should be removed from >>>> bl_committing. Maybe we can add a (void*) to struct >>>> nfs4_layoutcommit_data, so that LD can pass some private information >>>> between encode_layoutcommit() and cleanup_layoutcommit()? >>> >>> 3. What is the purpose of pinning the layout segment at all if neither blocks, nor >>> objects nor files cares? >> I believe it is for protecting against layoutrecall. But since we are seperating lseg and LD specific layout information management, it is actually not working as expected. >> The layout segments are not really in use while in LAYOUTCOMMIT. We only need to get the stateid right with respect to concurrent layout recalls. >>> IOW: if both objects and blocks track the information that they need for >>> layoutcommit outside the layout segments, then why do we need the >>> NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list at all? >> If we remove NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list, we must find other method to protect agains freeing lseg while layoutcommit is needed or is going on. e.g., check for NFS_INO_LAYOUTCOMMIT bit and inflight layoutcommit in initiate_file_draining(). > > The easiest solution is to ensure we have only _one_ LAYOUTCOMMIT on the > wire at a time. Otherwise, you are looking at a many-to-many mapping > between layoutcommits and lsegs. > > We should not expect to need multiple layoutcommits on the wire if pNFS > is working smoothly (i.e. no layout recalls), so optimising for that > case is wrong. > I'd also think that we want to order layoutcommit and ordinary commits > (for those NFS files servers that require both) so that we don't end up > writing a file size change to disk before the actual data is committed. > > So why not just protect the layoutcommit call using the existing > nfs_commit_set_lock/nfs_commit_clear_lock? > Sounds good to me, Benny -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html