Re: [PATCH] pnfs: Kick a pnfs_layoutcommit_inode on recall

On Tue, Aug 26, 2014 at 2:41 PM, Trond Myklebust
<trond.myklebust@xxxxxxxxxxxxxxx> wrote:
> On Tue, Aug 26, 2014 at 2:19 PM, Boaz Harrosh <boaz@xxxxxxxxxxxxx> wrote:
>> On 08/26/2014 08:54 PM, Trond Myklebust wrote:
>>> On Tue, Aug 26, 2014 at 1:06 PM, Boaz Harrosh <boaz@xxxxxxxxxxxxx> wrote:
>>
>>>
>>> The deadlock occurs _if_ the above layout commit is unable to get a
>>> slot. You can't guarantee that it will, because the slot table is a
>>> finite resource and it can be exhausted
>>
>> Yes, all I have ever seen is 1 slot in any of the clients/servers I've
>> seen, so I assume there is only ever 1 slot.
>>
>>> if you allow fore channel
>>> calls to trigger synchronous recalls on the back channel
>>
>> Beep! But this is exactly what I'm trying to say. The STD specifically
>> forbids that. The server is not allowed to wait here; it must return
>> immediately, with an error that frees the slot, and then later issue the
>> RECALL.
>>
>> This is what I said exactly three times in my mail, and what I have
>> depicted in my flow:
>>         Server async operation (mandated by the STD)
>>         Client back-channel can be sync with fore channel (Not mentioned by the STD)
>>
>>> that again trigger synchronous calls on the fore channel.
>>
>>
>>> You're basically saying
>>> that the client needs to guarantee that it can allocate 2 slots before
>>> it is allowed to send a layoutget just in case the server needs to
>>> recall a layout.
>>>
>>
>> No, I am not saying that, please count. Since the server is not allowed
>> a sync operation, one slot is enough and the client can do a sync lo_commit
>> while in recall.
>>
>>> If, OTOH, the layoutcommit is asynchronous, then there is no
>>> serialisation and the back channel thread can happily reply to the
>>> layout recall even if there are no free slots in the fore channel.
>>>
>>
>> Sure that will work as well, but not optimally, and for no good reason.
>>
>> Please go back to my flow with the 3 cases. See how the server never waits
>> for anything and will always immediately reply to the layout_get.
>> Since the server is not allowed a sync operation and is mandated by the
>> RFC text to not wait, the client is allowed to do sync operations,
>> because it is enough that only one side is async.
>>
>> BTW: If what you are saying is true, then there is a bug in the slot code
>> because this patch does work, and everything flows past this situation.
>> I have a reproducer test that fails 100% of the time without this patch
>> and only fails much later at some other place, but not at this deadlock,
>> with this patch applied.
>>
>> Cheers
>> Boaz
>>
>
> Whether or not your particular server allows it is irrelevant.
> We're not coding the client to a particular implementation. None of
> the other callbacks do synchronous RPC calls, and that's very
> intentional.
>

So to return to the original question: could we please change the
layoutcommit in your patch so that it is asynchronous?
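
For illustration only, the slot-exhaustion scenario discussed above can be
modelled with a toy slot table. This is a rough sketch, not kernel code:
SlotTable, recall_with_sync_commit and recall_with_async_commit are
hypothetical names, and the model assumes the worst case where the only
fore channel slot is still held by an outstanding call when the recall
arrives.

```python
from collections import deque


class SlotTable:
    """Toy model of a fore channel slot table (hypothetical, not the kernel's)."""

    def __init__(self, nslots):
        self.free = nslots
        self.waiting = deque()  # async requests queued until a slot frees up

    def try_acquire(self):
        if self.free > 0:
            self.free -= 1
            return True
        return False

    def release(self):
        self.free += 1
        # Run queued async requests now that a slot is available.
        while self.waiting and self.free > 0:
            self.free -= 1
            job = self.waiting.popleft()
            job()
            self.free += 1


def recall_with_sync_commit(table):
    """The callback thread must complete the layoutcommit before replying."""
    if not table.try_acquire():
        # A real thread would sleep here; the recall reply is never sent
        # while the slot holder is itself waiting on that reply.
        return "would block"
    table.release()
    return "replied"


def recall_with_async_commit(table, log):
    """The callback thread queues the layoutcommit and replies immediately."""
    def commit():
        log.append("LAYOUTCOMMIT sent")

    if table.try_acquire():
        commit()
        table.release()
    else:
        table.waiting.append(commit)
    return "replied"


if __name__ == "__main__":
    table = SlotTable(1)
    table.try_acquire()                      # outstanding call holds the only slot
    print(recall_with_sync_commit(table))    # sync variant cannot make progress
    log = []
    print(recall_with_async_commit(table, log))
    table.release()                          # outstanding call completes
    print(log)
```

In this model the asynchronous variant replies to the recall regardless of
how many slots are free, and the queued layoutcommit goes out once the
outstanding call releases its slot.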

-- 
Trond Myklebust

Linux NFS client maintainer, PrimaryData

trond.myklebust@xxxxxxxxxxxxxxx