On Wed, 2011-09-21 at 23:45 +0800, Peng Tao wrote:
> On Wed, Sep 21, 2011 at 9:56 PM, Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> > On 09/21/2011 02:50 PM, Benny Halevy wrote:
> >> On 2011-09-21 14:42, Boaz Harrosh wrote:
> >>> On 09/21/2011 02:27 PM, Benny Halevy wrote:
> >>>>> Unless we do following:
> >>>>> 1. preallocate memory for extent state conversion
> >>>>> 2. use nfsiod/rpciod to handle bl_write_cleanup
> >>>>> 3. for pnfs error case, create a kthread to recollapse and resend to MDS
> >>>>> I don't quite understand. How do you use nfs state manager to do other tasks?
> >>>>
> >>>> You need to keep a list of things to do hanging off of the nfs client structure
> >>>> and set a bit in cl_state telling the state manager it has work to do
> >>>> and wake it up. It then needs to go over the list of, say nfs_inodes
> >>>> and call into the layout driver to handle the errors.
> >>>>
> >>>> Benny
> >>>
> >>> Good god, Is it not already too complicated?
> >>>
> >>> The LD is out of the picture. You all seemed to agree that
> >>> the LD has reported an io_done on the nfsiod/rpciod, and in the error case
> >>> Generic layer needs to do its coalescing on some other thread. So
> >>> your description above is not correct, the LD is out of the picture.
> >>>
> >>
> >> True, if the ld cleanup on io_done is sufficient.
> >>
> >>> It all looks too complicated for me. A pnfs workqueue for both 2 and 3
> >>> above is very good. Especially since the workqueue also shares global
> >>> pool threads, No? I like it that there is a preallocated thread for
> >>> the error-case, think about it.
> >>
> >> I'm fine too with using a workqueue for the error case.
> >> But I'd rather have the common case done path do only lightweight,
> >> wait free processing.
> >>
> >> Benny
> >>
> >
> > If by "common case done path do only lightweight" you mean
> > "preallocate memory for extent state conversion". Then I absolutely
> > agree. But as far as workqueue/kthread then nfsiod/rpciod-wq or
> > pnfs-wq is exactly the same for the "common case". Unless I'm
> > totally missing the point. What are you saying?
> >
> > These are the options so far:
> >
> > [Toe's option which he rather not]
> > 1. preallocate memory for extent state conversion
> > 2. use nfsiod/rpciod to handle bl_write_cleanup
> > 3. for pnfs error case, create a kthread to recollapse and resend to MDS
> >
> > [My option which I think Toe agrees with]
> > 1. preallocate memory for extent state conversion
> > 2. use pnfs-wq to handle bl_write_cleanup
> > 3. pnfs error case, just like Toe's patches as part of io_done
> >    on pnfs-wq
> Yeah, I would vote for this one because of its simplicity. ;-)

Sigh... The problem is that it completely fails to address the problem.

What's the difference between having pNFS completions run on nfsiod or
their own work queue? You'd be running i/o and allocations on the same
queue in both cases.

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
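
For readers following the "preallocate memory for extent state conversion" item that both option lists share, here is a rough user-space sketch of the idea in C. It is not kernel code and not taken from any patch in this thread: `struct extent_conv`, `struct write_req`, `submit_write()` and `complete_write()` are all hypothetical names made up for illustration. The only point it tries to show is that the record needed at completion time is allocated when the write is submitted, so the done path does no allocation and cannot block on memory. It does not say anything about which queue or thread should run the completion.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the record needed to convert extent state. */
struct extent_conv {
	unsigned long offset;
	unsigned long length;
	int committed;
};

/* Hypothetical write request that carries its own preallocated record. */
struct write_req {
	unsigned long offset;
	unsigned long length;
	struct extent_conv *conv;	/* allocated at submit time */
};

/*
 * Submission path: allowed to allocate and to fail.
 * Both the request and the conversion record are allocated here.
 */
struct write_req *submit_write(unsigned long offset, unsigned long length)
{
	struct write_req *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	req->conv = malloc(sizeof(*req->conv));
	if (!req->conv) {
		free(req);
		return NULL;
	}
	req->offset = offset;
	req->length = length;
	return req;
}

/*
 * Completion path: no allocation, only fills in the record that was
 * reserved at submit time. Ownership of the record passes to the caller.
 */
struct extent_conv *complete_write(struct write_req *req)
{
	struct extent_conv *conv;

	assert(req != NULL && req->conv != NULL);
	conv = req->conv;
	conv->offset = req->offset;
	conv->length = req->length;
	conv->committed = 1;
	req->conv = NULL;
	free(req);
	return conv;
}
```

A caller would pair the two, e.g. `req = submit_write(4096, 8192)` followed later by `conv = complete_write(req)`, and free `conv` once the extent state has been updated. An allocation failure then surfaces at submit time, where the request can still be refused, rather than in the done path.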