RE: ContextQueue vs op_wq race?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



sorry forget cc ceph-devel.

-----Original Message-----
From: Ma, Jianpeng 
Sent: Wednesday, October 17, 2018 11:16 AM
To: Sage Weil <sweil@xxxxxxxxxx>; xie.xingguo@xxxxxxxxxx
Cc: Ma, Jianpeng <jianpeng.ma@xxxxxxxxx>
Subject: RE: ContextQueue vs op_wq race?

Hi Sage, xingguo:

Yes there is a race, see the following code:


		ThreadA												ThreadB
  sdata->shard_lock.Lock();                                                                                              
  if (sdata->pqueue->empty() &&                                                                                          
     !(is_smallest_thread_index && !sdata->context_queue.empty())) {                                                     
  
												 void queue(list<Context *>& ls) {
  											 		bool empty = false;
  										  			{   
     	 													std::scoped_lock l(q_mutex);
      														if (q.empty()) {
     									   					  q.swap(ls);
        													  empty = true;
      														} else {
        													  q.insert(q.end(), ls.begin(), ls.end());
      														}   
    													 }   

    													if (empty) {
     														 mutex.Lock();
     														 cond.Signal();
      														 mutex.Unlock();
  													 } 
												}  





     sdata->sdata_wait_lock.Lock();                                                                                    
    if (!sdata->stop_waiting) {                                                                                          


I modified the code as following and think this can fix the above race.

  sdata->shard_lock.Lock();                                                                                              
  if (is_smallest_thread_index && sdata->pqueue->empty()) {                                                              
    sdata->sdata_wait_lock.Lock();   //check context_queue must be locked by sdata_wait.
    if (sdata->context_queue.empty()) {
      if (!sdata->stop_watiting) {
        dout(20) << __func__ << " empty q, waiting" << dendl;                                                            
        osd->cct->get_heartbeat_map()->clear_timeout(hb);                                                                
        sdata->shard_lock.Unlock();                                                                                      
        sdata->sdata_cond.Wait(sdata->sdata_wait_lock);                                                                  
        sdata->sdata_wait_lock.Unlock();                                                                                 
        sdata->shard_lock.Lock();                                                                                        
        if (sdata->pqueue->empty() && sdata->context_queue.empty()) {                                        
          sdata->shard_lock.Unlock();                                                                                    
          return;                                                                                                        
        }                                                                                                                
        osd->cct->get_heartbeat_map()->reset_timeout(hb,                                                                 
            osd->cct->_conf->threadpool_default_timeout, 0);                                                             
      } else {
        dout(20) << __func__ << " need return immediately" << dendl;
        sdata->sdata_wait_lock.Unlock();
        sdata->shard_lock.Unlock();
        return;
      }
    } else {                                                                                                             
        sdata->sdata_wait_lock.Unlock();
    }

  } else if (sdata->pqueue->empty()) {
    ;
  }

BTY: why not remove sdata_wait_lock ?  i think it safe to remove sdata_wait_lock. Or am I missing something?

Jianpeng
Thanks!


-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Ma, Jianpeng
Sent: Wednesday, October 17, 2018 9:59 AM
To: Sage Weil <sweil@xxxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: RE: ContextQueue vs op_wq race?

Ok, I'll check this now.

Thanks!
Jianpeng

-----Original Message-----
From: Sage Weil [mailto:sweil@xxxxxxxxxx] 
Sent: Wednesday, October 17, 2018 6:55 AM
To: Ma, Jianpeng <jianpeng.ma@xxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: ContextQueue vs op_wq race?

Hi Jianpeng,

Can you take a look at http://tracker.ceph.com/issues/36473 and see if you have any ideas?  Does that possible race make sense?

Thanks!
sage




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux