Re: [PATCH] libceph: wake up sync waiter when unregistering request

> On 15 Dec 2016, at 05:58, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> 
> On Wed, Nov 30, 2016 at 11:02 AM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>> 
>>> On 30 Nov 2016, at 17:06, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>> 
>>> On Wed, Nov 30, 2016 at 1:30 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>> On Tue, Nov 29, 2016 at 11:29 PM, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>>>> On Tue, Nov 29, 2016 at 4:03 PM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>>> 
>>>>>>> On 29 Nov 2016, at 22:58, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> On Tue, Nov 29, 2016 at 2:59 PM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>>> On 29 Nov 2016, at 21:54, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>>>>>>>> 
>>>>>>>>> On Tue, Nov 29, 2016 at 12:49 PM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>>>>>>> Current code does not wake up sync waiter if osd replies error
>>>>>>>>>> or the request does not want ondisk ack.
>>>>>>>>> 
>>>>>>>>> Hi Zheng,
>>>>>>>>> 
>>>>>>>>> Did you mean to write "and" instead of "or" - "if osd replies error
>>>>>>>>> *and* the request does not want ondisk ack"?
>>>>>>>> 
>>>>>>>> I mean ‘or’.  The code does not wake up the sync waiter in either of these two cases.
>>>>>>> 
>>>>>>> It does wake the sync waiter (r_safe_completion) on error, but only
>>>>>>> if ondisk ack was requested.  If ondisk ack wasn't requested, it's not
>>>>>>> woken up, success or error.
>>>>>>> 
>>>>>>> By not setting CEPH_OSD_FLAG_ONDISK, the submitter is indicating that
>>>>>>> *any* reply is fine.  Blocking on the safe reply instead of any reply
>>>>>>> in the !CEPH_OSD_FLAG_ONDISK case sounds wrong.
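A minimal userspace C sketch of the logic being described may help; the flag values and the r_safe_completion field are taken from include/linux/ceph/rados.h and net/ceph/osd_client.c, but the completion stand-in and the control flow are simplified assumptions, not the kernel source:

#include <stdbool.h>

/* Flag values as in include/linux/ceph/rados.h */
#define CEPH_OSD_FLAG_ONDISK 0x0004   /* want/is "ondisk" commit ack */
#define CEPH_OSD_FLAG_WRITE  0x0020

/* Userspace stand-in for <linux/completion.h> */
struct completion { bool done; };
static void complete_all(struct completion *c) { c->done = true; }

struct ceph_osd_request {
	unsigned int r_flags;                 /* request flags */
	struct completion r_safe_completion;  /* what ceph_osdc_sync() waits on */
};

/*
 * Model of the pre-fix reply path: r_safe_completion is completed only
 * when the reply carries the ONDISK flag.  An error reply without
 * ONDISK, or a request that never asked for an ondisk ack, leaves the
 * sync waiter blocked forever.
 */
static void handle_reply_model(struct ceph_osd_request *req,
			       unsigned int reply_flags)
{
	if (reply_flags & CEPH_OSD_FLAG_ONDISK)
		complete_all(&req->r_safe_completion);
	/* else: nobody wakes the waiter */
}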
>>>>>>> 
>>>>>>> What is the motivation for this patch?
>>>>>> 
>>>>>> I saw fsstress hang at ceph_osdc_sync several times.
>>>>>> 
>>>>>> [52310.533483] sysrq: SysRq : Show Blocked State
>>>>>> [52310.534057]   task                        PC stack   pid father
>>>>>> [52310.534653] fsstress        D14656  2836   2834 0x00000000
>>>>>> [52310.535568]  ffff880036cff200 ffff88003e2dac80 0000000000002fa3 ffff88003e3142c0
>>>>>> [52310.536529]  ffff88003fd18318 ffffc900038b7d48 ffffffff819354cc ffffffff00000000
>>>>>> [52310.537464]  000000013e314ae0 ffff88003e314890 ffff88003fd18318 ffff88003fd18300
>>>>>> [52310.538396] Call Trace:
>>>>>> [52310.538762]  [<ffffffff819354cc>] ? __schedule+0x6ac/0x880
>>>>>> [52310.539326]  [<ffffffff81935713>] schedule+0x73/0x90
>>>>>> [52310.539840]  [<ffffffff81939d21>] schedule_timeout+0x31/0x5d0
>>>>>> [52310.540425]  [<ffffffff810ac4a6>] ? mark_held_locks+0x76/0x90
>>>>>> [52310.540980]  [<ffffffff8193b447>] ? _raw_spin_unlock_irq+0x27/0x50
>>>>>> [52310.541582]  [<ffffffff810ac64d>] ? trace_hardirqs_on_caller+0x18d/0x1c0
>>>>>> [52310.542217]  [<ffffffff81936a1b>] wait_for_completion+0x8b/0x100
>>>>>> [52310.542788]  [<ffffffff8108dc10>] ? wake_up_q+0x70/0x70
>>>>>> [52310.543380]  [<ffffffffa00116b8>] ceph_osdc_sync+0xc8/0x150 [libceph]
>>>>>> [52310.543990]  [<ffffffff811eaed0>] ? SyS_tee+0x3b0/0x3b0
>>>>>> [52310.544569]  [<ffffffffa005c3ea>] ceph_sync_fs+0x3a/0xb0 [ceph]
>>>>>> [52310.545158]  [<ffffffff811eaeeb>] sync_fs_one_sb+0x1b/0x20
>>>>>> [52310.545699]  [<ffffffff811b931e>] iterate_supers+0x7e/0xe0
>>>>>> [52310.546258]  [<ffffffff811eb1e0>] sys_sync+0x50/0x80
>>>>>> [52310.546849]  [<ffffffff8193bb6a>] entry_SYSCALL_64_fastpath+0x18/0xad
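The hang itself is in ceph_osdc_sync(), which blocks on r_safe_completion for every in-flight write. Schematically, reusing the stubs from the sketch above (the real function walks the osdc request tree under a lock and tracks tids, all omitted here):

#include <stddef.h>

/* Stand-in for wait_for_completion(): in the kernel this sleeps until
 * complete_all() runs; with no wakeup it never returns, which is where
 * fsstress is stuck in the trace above. */
static void wait_for_completion_model(struct completion *c)
{
	while (!c->done)
		;
}

static void osdc_sync_model(struct ceph_osd_request *reqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!(reqs[i].r_flags & CEPH_OSD_FLAG_WRITE))
			continue;  /* sync only waits for writes */
		wait_for_completion_model(&reqs[i].r_safe_completion);
	}
}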
>>>>> 
>>>>> Did you happen to capture /osdc?  It's probably the pool perm check
>>>>> thing.
>>>> 
>>>> osdc is completely empty.
>>> 
>>> If you haven't already done so while investigating, could you please
>>> dump the offending @req fields?  I want to make sure it's the pool perm
>>> check request…
>> 
>> Sorry, I have already rebooted that machine.
> 
> Hi Zheng,
> 
> I took it a step further and also fixed libceph cancellation code:
> 
> https://github.com/ceph/ceph-client/commit/c297eb42690b904fb5b78dd9ad001bafe25f49ec
> 
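Going only by the patch subject, the shape of the fix is to complete the sync waiter unconditionally when the request is torn down, not just on an ONDISK ack. A schematic guess, again reusing the stubs above (see the linked commit for the actual change):

/*
 * Complete r_safe_completion whenever the request is unregistered, so
 * ceph_osdc_sync() is also woken on error and cancellation paths.
 */
static void unregister_request_model(struct ceph_osd_request *req)
{
	complete_all(&req->r_safe_completion);  /* always, not only on ONDISK */
	/* removal from the osdc trees/lists and ref drop omitted */
}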

looks good

Yan, Zheng


> I squashed your patch to reduce rename churn.  Let me know if that's OK
> with you.
> 
> Thanks,
> 
>               Ilya
