Re: osd down when run fio randwrite 4k using bluestore

On Tue, 11 Jul 2017, Wangwenfeng wrote:
> The problem still exists in the latest master, which contains 16051.

If you can reproduce this, can you please generate a log with 'debug 
bluestore = 20' on the osd?
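
One way to set this, assuming the OSDs read their options from ceph.conf
(placing it in the [osd] section is an assumption about your setup; the OSD
would need a restart to pick it up):

```ini
# ceph.conf fragment (sketch): raise BlueStore log verbosity for all OSDs
[osd]
debug bluestore = 20
```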

Thanks!
sage

> 
> 
> 
> I suspect this is the recent deferred_aggressive deadlock; the fix merged last week.  Please try the latest master branch and see if you can reproduce.
> 
> https://github.com/ceph/ceph/pull/16051
> 
> Thanks!
> sage
> 
> 
> On Mon, 10 Jul 2017, Wangwenfeng wrote:
> 
> > 
> > Hi, Sage
> >  I set up a Ceph cluster running Luminous 12.0.3. Its OSDs use bluestore, and I created a cephfs whose metadata pool is replicated and whose data pool is erasure-coded.
> >    pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 52 flags hashpspool stripe_width 0
> >    pool 2 'EC_2_1_8' erasure size 3 min_size 2 crush_ruleset 2 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 56 flags hashpspool,ec_overwrites stripe_width 8192 expected_num_objects 27000000
> >   I then ran fio against the cluster with this command:
> >    fio --numjobs=16 --iodepth=16 --ioengine=libaio --runtime=600 
> > --direct=1 --group_reporting --rw=randwrite --bs=4k --name=aa 
> > --filename=/ec/1.txt --size=500G
> > 
> >   After a while, some osds are reported down. This also happens in 12.1.0 and master.
> > 
> >   I have a question about the following code:
> > void BlueStore::_deferred_queue(TransContext *txc) {
> >   dout(20) << __func__ << " txc " << txc << " osr " << txc->osr << dendl;
> >   std::lock_guard<std::mutex> l(deferred_lock);
> >   …………………………
> >   if (deferred_aggressive &&
> >       !txc->osr->deferred_running) {
> >     _deferred_submit(txc->osr.get());
> >   }
> > }
> > 
> > If I negate deferred_aggressive by adding '!', the osds no longer go down. Could you tell me whether this modification is correct?
> >   if (!deferred_aggressive &&
> >       !txc->osr->deferred_running) {
> >     _deferred_submit(txc->osr.get());
> >   }
> > 
> > 
> > 
