Hi, Trond, On Sat, Sep 10, 2011 at 2:20 AM, Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> wrote: > On Thu, 2011-09-08 at 23:11 -0400, tao.peng@xxxxxxx wrote: >> HI, Trond, >> >> > -----Original Message----- >> > From: Myklebust, Trond [mailto:Trond.Myklebust@xxxxxxxxxx] >> > Sent: Friday, September 09, 2011 1:05 AM >> > To: Peng Tao >> > Cc: Peng, Tao; gusev.vitaliy@xxxxxxxxxxx; gusev.vitaliy@xxxxxxxxx; >> > linux-nfs@xxxxxxxxxxxxxxx >> > Subject: RE: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release >> > >> > > -----Original Message----- >> > > From: Peng Tao [mailto:bergwolf@xxxxxxxxx] >> > > Sent: Thursday, September 08, 2011 11:00 AM >> > > To: Myklebust, Trond >> > > Cc: tao.peng@xxxxxxx; gusev.vitaliy@xxxxxxxxxxx; >> > > gusev.vitaliy@xxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx >> > > Subject: Re: [PATCH] nfs: fix inifinite loop at >> > > nfs4_layoutcommit_release >> > > >> > > On Thu, Sep 8, 2011 at 8:01 PM, Myklebust, Trond >> > > <Trond.Myklebust@xxxxxxxxxx> wrote: >> > > >> -----Original Message----- >> > > >> > > > >> > > > Yes, but as far as I can see, even in the blocks case there can be >> > > multiple extents per layout segment. What if I write to one >> > > uninitialised extent, layoutcommit, then write to another uninitialized >> > > extent in the same layout segment and layoutcommit? In my reading of >> > > the code, there is a chance that the second layoutcommit will fail to >> > > pick up the layout segment, and so will fail to notify the MDS that the >> > > second extent now contains data. >> > > >> > > blocklayout does not decide what to layoutcommit according to the lseg >> > > list. Instead, it keeps track of each extent's state at the >> > > granularity of blocksize, and encode whatever needs layoutcommitted in >> > > the layoutcommit call. So in your above case, as long as the second >> > > layoutcommit is issued, blocklayout will encode the newly written >> > > extent in the second layoutcommit call, even if the lseg is not >> > > attached to the second layoutcommit. >> > > >> > > But that leads to another two question: >> > > 1. How do we protect against layoutrecall if lseg is not linked to >> > > layoutcommit? For this one, can we just reject layoutrecall if there >> > > is inflight layoutcommit? It will be less parallel but can guarantee >> > > current implementation correctness. >> > > 2. blocklayout ONLY: bl_committing may be overloaded by several >> > > layoutcommit calls and we don't have information in >> > > cleanup_layoutcommit() on how many entry should be removed from >> > > bl_committing. Maybe we can add a (void*) to struct >> > > nfs4_layoutcommit_data, so that LD can pass some private information >> > > between encode_layoutcommit() and cleanup_layoutcommit()? >> > >> > 3. What is the purpose of pinning the layout segment at all if neither blocks, nor >> > objects nor files cares? >> I believe it is for protecting against layoutrecall. But since we are seperating lseg and LD specific layout information management, it is actually not working as expected. >> >> > IOW: if both objects and blocks track the information that they need for >> > layoutcommit outside the layout segments, then why do we need the >> > NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list at all? >> If we remove NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list, we must find other method to protect agains freeing lseg while layoutcommit is needed or is going on. e.g., check for NFS_INO_LAYOUTCOMMIT bit and inflight layoutcommit in initiate_file_draining(). > > The easiest solution is to ensure we have only _one_ LAYOUTCOMMIT on the > wire at a time. Otherwise, you are looking at a many-to-many mapping > between layoutcommits and lsegs. > > We should not expect to need multiple layoutcommits on the wire if pNFS > is working smoothly (i.e. no layout recalls), so optimising for that > case is wrong. > I'd also think that we want to order layoutcommit and ordinary commits > (for those NFS files servers that require both) so that we don't end up > writing a file size change to disk before the actual data is committed. > > So why not just protect the layoutcommit call using the existing > nfs_commit_set_lock/nfs_commit_clear_lock? Completely agree. We should serialize layoutcommit instead of trying to handle multiple of them concurrently. -- Thanks, -Bergwolf -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html