Hi,

We are currently evaluating the performance of the rbd-nbd module. We found that its performance is much lower (only about half) than using rbd directly when there are multiple clients (fio jobs >= 16).

We tried adding multi-connection support to rbd-nbd against the newest nbd driver, so that the nbd driver can create multiple IO queues, with each IO queue tied to its own socket connection to rbd-nbd for sending requests and receiving responses (see the setup sketch in [1] below). However, this did not give us the expected gain (only ~5%-10%), and there is still an obvious gap compared with using rbd directly.

As a quick hack, we made rbd-nbd respond to the nbd device with success without forwarding the requests to the Ceph cluster (see the no-op responder sketch in [2] below). This gave us a hint about the overhead of the nbd device's communication with rbd-nbd: the latency is around 30us at queue depth 1.

So, if the nbd device itself introduces very little overhead and scales well locally, what is the main issue limiting the scalability of rbd-nbd when it sends requests to the Ceph cluster? Any hints would be much appreciated.

Thanks,
Sheng
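
[1] For reference, the kernel-side setup we rely on is roughly the following. This is a simplified sketch, not our actual rbd-nbd patch: it assumes an nbd driver that accepts one NBD_SET_SOCK ioctl per connection for multi-connection support, and the device path, device size, and connection count are just placeholders; error handling is omitted for brevity. serve_one_connection() is the per-socket handler shown in [2].

/*
 * [1] Simplified sketch: hand N socket connections to the kernel nbd
 * driver so it can create one IO queue per connection.  Assumes a
 * driver that accepts one NBD_SET_SOCK ioctl per connection.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/nbd.h>

#define NUM_CONNECTIONS 4

void *serve_one_connection(void *arg);   /* per-socket loop, see [2] */

int main(void)
{
    int nbd = open("/dev/nbd0", O_RDWR);
    pthread_t tid[NUM_CONNECTIONS];

    ioctl(nbd, NBD_SET_BLKSIZE, 4096UL);
    ioctl(nbd, NBD_SET_SIZE_BLOCKS, (1UL << 30) / 4096);   /* 1 GiB test device */

    for (int i = 0; i < NUM_CONNECTIONS; i++) {
        int sp[2];
        socketpair(AF_UNIX, SOCK_STREAM, 0, sp);
        ioctl(nbd, NBD_SET_SOCK, sp[0]);                   /* kernel end */
        pthread_create(&tid[i], NULL, serve_one_connection,
                       (void *)(intptr_t)sp[1]);           /* userspace end */
    }

    ioctl(nbd, NBD_DO_IT);   /* blocks until the device is disconnected */
    return 0;
}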
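
[2] The "reply with success without forwarding to the cluster" hack is, in spirit, a per-connection loop like the one below (again a simplified sketch, compiled together with [1] and linked with -lpthread). It only assumes the classic request/reply wire format and command codes from <linux/nbd.h>; the helper xfer() and the buffer sizes are illustrative.

/*
 * [2] Simplified no-op responder: read nbd requests from one socket and
 * immediately acknowledge them with success, never touching the Ceph
 * cluster.  Writes are drained and discarded; reads return zeros.
 */
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <linux/nbd.h>

/* read (do_write == 0) or write (do_write == 1) exactly len bytes */
static void xfer(int fd, void *buf, size_t len, int do_write)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = do_write ? write(fd, p, len) : read(fd, p, len);
        if (n <= 0)
            _exit(1);                      /* peer went away */
        p += n;
        len -= n;
    }
}

void *serve_one_connection(void *arg)
{
    int sock = (int)(intptr_t)arg;
    static char zeros[64 * 1024];          /* data returned for reads */
    struct nbd_request req;
    struct nbd_reply reply;

    reply.magic = htonl(NBD_REPLY_MAGIC);
    reply.error = 0;                       /* always report success */

    for (;;) {
        xfer(sock, &req, sizeof(req), 0);
        uint32_t type = ntohl(req.type) & NBD_CMD_MASK_COMMAND;
        uint32_t len  = ntohl(req.len);

        if (type == NBD_CMD_DISC)          /* disconnect: no reply expected */
            break;

        if (type == NBD_CMD_WRITE) {       /* drain and discard the payload */
            char sink[4096];
            while (len > 0) {
                size_t chunk = len < sizeof(sink) ? len : sizeof(sink);
                xfer(sock, sink, chunk, 0);
                len -= chunk;
            }
        }

        memcpy(reply.handle, req.handle, sizeof(reply.handle));
        xfer(sock, &reply, sizeof(reply), 1);

        if (type == NBD_CMD_READ) {        /* reply is followed by the data */
            uint32_t left = ntohl(req.len);
            while (left > 0) {
                size_t chunk = left < sizeof(zeros) ? left : sizeof(zeros);
                xfer(sock, zeros, chunk, 1);
                left -= chunk;
            }
        }
    }
    return NULL;
}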