On 06/05/2014 09:01 AM, Haomai Wang wrote:
Hi, previously I sent a mail about the difficulty of computing rbd snapshot size statistics. The main solution is to use an object map to record changes. The problem is that we can't handle concurrent modification by multiple clients.

The lack of an object map (like the pointer map in qcow2) causes many problems in librbd, such as the clone-depth issue: a deep clone chain causes remarkable latency, and each additional clone layer roughly doubles it.

I'd like to make a tradeoff between multi-client and single-client support in librbd. In practice, most volumes/images are used by VMs, where only one client accesses or modifies the image. We shouldn't insist on making shared images possible at the cost of making the most common use cases slow.

So we could add a new flag called "shared" when creating an image. If "shared" is false, librbd maintains an object map for each image. The object map is meant to be durable: each image_close call stores the map into rados. If a client crashes and fails to dump the object map, the next client that opens the image treats the object map as out of date and resets it.
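The close/reopen protocol described above can be sketched roughly as follows. This is a hedged illustration, not the librbd implementation; the class and field names (ObjectMap, exists, clean) are invented for the example:

```python
# Hypothetical sketch (not the librbd API): an object map tracking which
# backing objects of an image exist, plus a "clean" flag so the next
# client can detect a crashed writer and invalidate the map.

class ObjectMap:
    def __init__(self, num_objects):
        self.exists = [False] * num_objects  # one flag per backing object
        self.clean = True                    # False while a writer holds the map

    def open(self, stored):
        # If the previous client crashed before image_close, the stored map
        # was never marked clean; treat it as out of date and reset it.
        if stored is None or not stored["clean"]:
            self.exists = [False] * len(self.exists)  # reset: state unknown
        else:
            self.exists = list(stored["exists"])
        self.clean = False  # in use until a successful close

    def mark_written(self, obj_index):
        self.exists[obj_index] = True

    def close(self):
        # image_close: persist the map and mark it clean.
        self.clean = True
        return {"exists": list(self.exists), "clean": self.clean}
```

A client that closes normally hands the next opener a clean map; a map left unclean by a crash is simply rebuilt, which is the safe fallback.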
Why not flush out the object map every X period? Suppose a client runs for weeks or months: you would then keep that map in memory the whole time, since the image is never closed.
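The periodic-flush idea could look something like this. A minimal sketch, assuming a dirty flag on the map and a pluggable store function standing in for the write to rados; none of these names come from librbd:

```python
import time

# Hypothetical sketch of flushing the object map every X period instead of
# only at image_close. The store_fn callback stands in for persisting the
# map to rados; the interval would be configurable.

class PeriodicFlusher:
    def __init__(self, flush_interval_sec, store_fn):
        self.flush_interval = flush_interval_sec
        self.store_fn = store_fn           # e.g. writes the map to rados
        self.dirty = False
        self.last_flush = time.monotonic()

    def on_map_update(self):
        # Called whenever the in-memory object map changes.
        self.dirty = True
        self.maybe_flush()

    def maybe_flush(self):
        now = time.monotonic()
        if self.dirty and now - self.last_flush >= self.flush_interval:
            self.store_fn()
            self.dirty = False
            self.last_flush = now
```

This bounds how stale the on-disk map can get for long-running clients, at the cost of extra periodic writes.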
The advantages of this feature are easy to see:
1. Avoids the clone performance problem.
2. Makes snapshot size statistics possible.
3. Improves librbd operation performance, including reads and copy-on-write operations.

What do you think of the above? More feedback is appreciated!
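The first advantage, avoiding the clone latency problem, can be illustrated with a toy model. This is an assumption-laden sketch (Image, object_map, and the lookup counting are invented here, not librbd code): without a map, a read of an unwritten object must be attempted at every ancestor in the clone chain, one round trip per level; with per-image maps held in memory, the client goes straight to the first image that actually has the object:

```python
# Toy model of a clone chain: each image may have a parent, a set of
# written objects, and an in-memory object map of which objects exist.

class Image:
    def __init__(self, parent=None):
        self.parent = parent
        self.objects = {}        # obj_index -> data
        self.object_map = set()  # indices known to exist in this image

    def write(self, idx, data):
        self.objects[idx] = data
        self.object_map.add(idx)

    def read_without_map(self, idx, trips=0):
        # One object lookup (round trip) per level of the clone chain.
        trips += 1
        if idx in self.objects:
            return self.objects[idx], trips
        if self.parent is not None:
            return self.parent.read_without_map(idx, trips)
        return None, trips

    def read_with_map(self, idx):
        # Object-map checks are in-memory; only one object read is issued.
        img = self
        while img is not None and idx not in img.object_map:
            img = img.parent
        if img is None:
            return None, 0
        return img.objects[idx], 1
```

For an object written only in the base image, a three-deep clone pays four lookups without the map but a single object read with it.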
--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on