Re: rpc sleep user task, will never wake up if no response from server

Hi Chuck,

Thanks for your response. I tried mounting via UDP, but the user task
still does not wake up.

The log shows "not responding, still trying" only once, and then the
client keeps retrying, without even a disconnect notification.

[57993.843932] RPC:   172 xprt_prepare_transmit
[57993.843935] RPC:   172 xprt_cwnd_limited cong = 0 cwnd = 512
[57993.843939] RPC:   172 xprt_transmit(92)
[57996.840019] RPC:   172 xprt_prepare_transmit
[57996.840026] RPC:   172 xprt_transmit(92)
[57999.840019] RPC:   172 xprt_prepare_transmit
[57999.840025] RPC:   172 xprt_transmit(92)
[58002.840016] nfs: server 192.168.100.91 not responding, still trying
[58002.840022] RPC:   172 xprt_prepare_transmit
[58002.840027] RPC:   172 xprt_transmit(92)
[58005.840016] RPC:   172 xprt_prepare_transmit
[58005.840022] RPC:   172 xprt_transmit(92)
[58008.840015] RPC:   172 xprt_prepare_transmit
[58008.840021] RPC:   172 xprt_transmit(92)
[58011.844024] RPC:   172 xprt_prepare_transmit

Regards,
Ivan

On Mon, Oct 11, 2010 at 11:17 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> Take a look at http://nfs.sourceforge.net/
>
> The upshot is that NFS defaults to "hard" mounts which means clients will retry an operation until it succeeds, possibly forever.  It sounds like you want a "soft" mount instead, which will time out operations that get no response.
>
> Note that using "soft" may impact data integrity.
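
A minimal sketch of how a user application could notice such a failure once the export is mounted "soft". It is an illustration only and has not been run against this setup. The names read_or_report and is_server_failure are hypothetical, and the assumption that a timed-out request reaches the application as EIO (or ETIMEDOUT) should be checked against nfs(5) for the kernel in use.

```python
import errno
import os

# Assumption: on a "soft" mount a request that exhausts its retries is
# reported to the caller as EIO; ETIMEDOUT is included as a second guess.
NFS_FAILURE_ERRNOS = {errno.EIO, errno.ETIMEDOUT}


def read_or_report(path, size=4096):
    """Return (data, None) on success, or (None, errno) if the OS call fails."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return None, exc.errno
    try:
        # On a "hard" mount this call can block for as long as the server
        # stays silent; on a "soft" mount it is expected to fail instead.
        return os.read(fd, size), None
    except OSError as exc:
        return None, exc.errno
    finally:
        os.close(fd)


def is_server_failure(err):
    """True if err looks like an unresponsive-server error (see assumption above)."""
    return err in NFS_FAILURE_ERRNOS
```

The caller checks the second element of the returned pair and, if is_server_failure() is true for it, takes whatever fallback action fits the application. On a "hard" mount none of this helps, because the read never returns while the server is unreachable.
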
>
> On Oct 11, 2010, at 4:46 AM, Ivan Chan wrote:
>
>> Hi all,
>>
>>    I am facing a problem with the 2.6.32-31 kernel: when I try to fetch
>> something from an NFS server and the server does not respond (for
>> example, because of a lost IP address), the user task hangs.
>> After scanning the RPC code, I think the user task is actually put to
>> sleep by RPC and will not wake up until it gets a response from the
>> server.
>> The following log shows the RPC retries and, finally, a disconnect;
>> however, the user application is still hung (sleeping). I am new to
>> this, and I don't know whether this is by design for some condition or
>> whether it should be fixed in general. In my use case, I would like the
>> user application to learn that there is an error from NFS/RPC so that
>> it can decide how to handle the failure. I have not read the RPC code
>> in depth yet; could you please suggest which part I should focus on?
>> Thanks a lot.
>>
>> [363855.998281] NFS: permission(0:18/1443226), mask=0x1, res=0
>> [363855.998286] NFS: dentry_delete(/out_A1.mp4, 88)
>> [363855.998303] NFS: permission(0:18/1443226), mask=0x1, res=0
>> [363855.998308] NFS: dentry_delete(/out_A3.mp4, 88)
>> [363855.998325] NFS: permission(0:18/1443226), mask=0x1, res=0
>> [363855.998329] NFS: dentry_delete(/zero, 88)
>> [363895.460527] RPC:   172 reserved req c344a000 xid 5f9ff6fe
>> [363895.460531] RPC:   172 xprt_prepare_transmit
>> [363895.460536] RPC:   172 xprt_transmit(96)
>> [363895.460556] RPC:   172 xmit complete
>> [363955.460015] RPC:   172 xprt_timer
>> [363955.460020] RPC:   172 xprt_prepare_transmit
>> [363955.460025] RPC:   172 xprt_transmit(96)
>> [363955.460037] RPC:   172 xmit complete
>> [364075.460015] RPC:   172 xprt_timer
>> [364075.460021] nfs: server 192.168.100.91 not responding, still trying
>> [364075.460025] RPC:   172 xprt_prepare_transmit
>> [364075.460030] RPC:   172 xprt_transmit(96)
>> [364075.460042] RPC:   172 xmit complete
>> [364135.460015] RPC:   172 xprt_timer
>> [364135.460020] RPC:   172 xprt_prepare_transmit
>> [364135.460026] RPC:   172 xprt_transmit(96)
>> [364135.460038] RPC:   172 xmit complete
>> [364255.460017] RPC:   172 xprt_timer
>> [364255.460022] RPC:   172 xprt_prepare_transmit
>> [364255.460027] RPC:   172 xprt_transmit(96)
>> [364255.460039] RPC:   172 xmit complete
>> [364315.460023] RPC:   172 xprt_timer
>> [364315.460028] RPC:   172 xprt_prepare_transmit
>> [364315.460033] RPC:   172 xprt_transmit(96)
>> [364315.460045] RPC:   172 xmit complete
>> [364435.460017] RPC:   172 xprt_timer
>> [364435.460023] RPC:   172 xprt_prepare_transmit
>> [364435.460028] RPC:   172 xprt_transmit(96)
>> [364435.460041] RPC:   172 xmit complete
>> [364495.460016] RPC:   172 xprt_timer
>> [364495.460021] RPC:   172 xprt_prepare_transmit
>> [364495.460026] RPC:   172 xprt_transmit(96)
>> [364495.460038] RPC:   172 xmit complete
>> [364615.460016] RPC:   172 xprt_timer
>> [364615.460022] RPC:   172 xprt_prepare_transmit
>> [364615.460027] RPC:   172 xprt_transmit(96)
>> [364615.460039] RPC:   172 xmit complete
>> [364675.460016] RPC:   172 xprt_timer
>> [364675.460021] RPC:   172 xprt_prepare_transmit
>> [364675.460026] RPC:   172 xprt_transmit(96)
>> [364675.460039] RPC:   172 xmit complete
>> [364795.460017] RPC:   172 xprt_timer
>> [364795.460022] RPC:   172 xprt_prepare_transmit
>> [364795.460027] RPC:   172 xprt_transmit(96)
>> [364795.460040] RPC:   172 xmit complete
>> [364855.460015] RPC:   172 xprt_timer
>> [364855.460020] RPC:   172 xprt_prepare_transmit
>> [364855.460025] RPC:   172 xprt_transmit(96)
>> [364855.460037] RPC:   172 xmit complete
>> [364944.152011] RPC:       disconnected transport c3682000
>> [364944.152021] RPC:   172 xprt_prepare_transmit
>> [364944.152026] RPC:   172 xprt_transmit(96)
>> [364944.152031] RPC:       disconnected transport c3682000
>> [364944.152037] RPC:   172 xprt_connect xprt c3682000 is not connected
>> [364944.152052] RPC:       disconnected transport c3682000
>> [364944.152058] RPC:   172 xprt_connect_status: retrying
>> [364944.152061] RPC:   172 xprt_prepare_transmit
>> [364944.152064] RPC:   172 xprt_transmit(96)
>> [364944.152066] RPC:       disconnected transport c3682000
>> [364944.152070] RPC:   172 xprt_connect xprt c3682000 is not connected
>>
>> Regards,
>> Ivan
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> --
> chuck[dot]lever[at]oracle[dot]com