On 05/17/2010 01:33 PM, Boaz Harrosh wrote:
> On 05/17/2010 12:59 PM, Zhang Jingwang wrote:
>> These two functions mustn't be called from the same workqueue. Otherwise
>> deadlock may occur. So we schedule the return_layout_barrier to nfsiod.
>> nfsiod may not be a good choice, maybe we should set up a new workqueue
>> to do the job.
>
> Please give more information. When does it happen that pnfs_XXX_done will
> return -EAGAIN?
>
> What is the stack trace of the deadlock?
>
> And please rebase that patch on the latest changes to _pnfs_return_layout().
> But since in the new code _pnfs_return_layout() must be called with NO_WAIT
> if called from the nfsiod, you cannot call pnfs_initiate_write/read() right
> after. For writes you can get by with doing nothing because the write-back
> thread will kick in soon enough. For reads I'm not sure; you'll need to send
> me more information, stack trace.
>
> Or you can wait for the new state machine.
>
> Boaz
>

BTW: I agree that current code is crap. Due to bugs in the osd library we never
return -EAGAIN, so I never tried that code. But it should theoretically trigger
when an OSD reboots or a network connection fails.

>>
>> Signed-off-by: Zhang Jingwang <zhangjingwang@xxxxxxxxxxxx>
>> ---

Thanks
Boaz
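
For readers following the thread: the exact deadlock path is not spelled out above (no stack trace was posted), so the following is only a user-space sketch of one common way such a deadlock can arise, assuming a queue served by a single worker. It is not kernel code and does not model nfsiod itself; the names `barrier`, `io_done`, and `run_barrier_then_io` are hypothetical stand-ins. A work item that queues a second item on its own single-worker queue and then blocks on it can never make progress, while the same pattern across two separate queues completes.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def run_barrier_then_io(barrier_pool, io_pool, timeout=0.5):
    """Run io_done on io_pool; it queues barrier on barrier_pool and waits.

    Returns True if the wait finished, False if it timed out (the timeout
    stands in for what would be a permanent hang without one).
    """

    def barrier():
        # Hypothetical stand-in for the layout-return barrier work.
        return "layout returned"

    def io_done():
        # Hypothetical stand-in for an I/O completion handler that
        # queues the barrier and then blocks until it has run.
        fut = barrier_pool.submit(barrier)
        try:
            fut.result(timeout=timeout)
            return True
        except TimeoutError:
            return False

    return io_pool.submit(io_done).result()


# Same single-worker queue: the only worker is busy running io_done,
# so barrier cannot start until io_done gives up waiting.
with ThreadPoolExecutor(max_workers=1) as q:
    same_queue_ok = run_barrier_then_io(q, q)

# Separate queues: barrier runs on its own worker and the wait completes.
with ThreadPoolExecutor(max_workers=1) as q1, ThreadPoolExecutor(max_workers=1) as q2:
    separate_queue_ok = run_barrier_then_io(q1, q2)

print(same_queue_ok, separate_queue_ok)
```

Moving the barrier to a different queue, as the patch proposes, corresponds to the second case. Whether nfsiod is the right queue for it is the open question in the thread.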