RADOS EC: is it okay to reduce the number of commits required for reply to client?

Hi Cephers,

We are testing the write performance of Ceph EC (Luminous, 8 + 4) and
noticed that tail latency is extremely high. For example, the avgtime
of the 10th commit is 40 ms, which is acceptable on an all-HDD
cluster; the 11th is 80 ms, doubled; and the 12th is 160 ms, doubled
again, which is not so good. We then made a small modification, tested
again, and did get a much better result. The patch is quite simple
(for testing only, of course):

--- a/src/osd/ECBackend.cc
+++ b/src/osd/ECBackend.cc
@@ -1188,7 +1188,7 @@ void ECBackend::handle_sub_write_reply(
     i->second.on_all_applied = 0;
     i->second.trace.event("ec write all applied");
   }
-  if (i->second.pending_commit.empty() && i->second.on_all_commit) {
+  if (i->second.pending_commit.size() == 2 && i->second.on_all_commit) {  // 8 + 4 - 10 = 2
     dout(10) << __func__ << " Calling on_all_commit on " << i->second << dendl;
     i->second.on_all_commit->complete(0);
     i->second.on_all_commit = 0;

As far as I can see, everything still works (maybe because of the
rwlock in the primary OSD? not sure though), but I'm afraid it might
break data consistency in ways I'm not aware of. So I'm writing to ask
if someone could kindly provide expert comments on this, or share any
known drawbacks. Thank you!

PS: The OSDs are backed by FileStore, not BlueStore, if that matters.

Regards,
Alex


