Hi ceph-devel,

In our Ceph cluster (with RGW), we ran into a problem where all RGW processes got stuck: every worker thread was waiting for a response from an OSD, and RGW started returning 500s to clients. An objecter_requests dump showed that the slow in-flight requests all targeted a single OSD, which had two PGs backfilling and hosted two bucket index objects.

On the OSD side we configure 8 op threads. When the problem occurred, several op threads spent seconds (even tens of seconds) handling bucket index ops, with most of that time spent waiting for the ondisk_read_lock. As a result, the throughput of the op threads dropped (qlen kept increasing).

I am wondering what options we can pursue to improve the situation. Some general ideas on my mind:

1> Similar to OpContext::rwstate, instead of leaving the op thread stuck, put the op on a waiting list and notify it when the lock becomes available (rough sketch after my signature). I am not sure whether this is worth it or whether it breaks anything.

2> Differentiate the service class at the filestore level for such an op - somebody is waiting for it to release the lock. Does this break any assumption at the filestore layer?

Since we are using EC (8+3), the fan-out is larger than for a replicated pool, so this kind of slowness on one OSD can cascade to more OSDs more easily.

BTW, I created a tracker for this: http://tracker.ceph.com/issues/10739

Looking forward to your suggestions.

Thanks,
Guang
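
P.S. To make 1> a bit more concrete, here is a rough, self-contained sketch of the idea in plain C++. It is not the actual OSD/filestore code; the names (OndiskReadLockSketch, Requeue) are made up for illustration. The point is that a failed try-lock parks a requeue callback instead of sleeping in the op thread, and the writer requeues the parked ops when it drops the lock:

#include <functional>
#include <list>
#include <mutex>

// Callback that puts the op back on the op work queue so it can retry later.
using Requeue = std::function<void()>;

class OndiskReadLockSketch {
  std::mutex mtx;              // protects the state below
  bool write_locked = false;   // held by the long-running writer
  int readers = 0;
  std::list<Requeue> waiters;  // ops parked instead of blocking a thread

public:
  // Returns true if the read lock was taken; otherwise parks the requeue
  // callback and returns false so the op thread can move on to the next op.
  bool try_read_lock_or_wait(Requeue requeue) {
    std::lock_guard<std::mutex> l(mtx);
    if (!write_locked) {
      ++readers;
      return true;
    }
    waiters.push_back(std::move(requeue));
    return false;
  }

  void read_unlock() {
    std::lock_guard<std::mutex> l(mtx);
    --readers;
  }

  // Writer side: only succeeds when nobody holds the lock.
  bool try_write_lock() {
    std::lock_guard<std::mutex> l(mtx);
    if (write_locked || readers > 0)
      return false;
    write_locked = true;
    return true;
  }

  // On release, wake all parked ops by requeueing them; they retry
  // try_read_lock_or_wait() when they are dequeued again.
  void write_unlock() {
    std::list<Requeue> to_wake;
    {
      std::lock_guard<std::mutex> l(mtx);
      write_locked = false;
      to_wake.swap(waiters);
    }
    for (auto& requeue : to_wake)
      requeue();   // no op thread ever slept on this lock
  }
};

If I understand rwstate correctly, it already requeues waiters rather than blocking, so this would mostly be extending the same pattern to the ondisk read lock path; whether requeueing here preserves the ordering assumptions in that path is exactly the part I am unsure about.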