Re: Status of Fencing

On Fri, 14 Sep 2012, Josh Durgin wrote:
> On 09/14/2012 05:14 PM, Mandell Degerness wrote:
> > I was wondering if there was an update on the RBD fencing question.
> > Either knowing when an RBD is mounted elsewhere or being able to
> > enforce a fence on an RBD would be really helpful.
> > 
> > -Mandell Degerness
> 
> Advisory locking of rbd images is ready for review in the wip-librbd-locking
> branch. This lets you lock/unlock images with the rbd cli tool,
> so you could detect when it's being used, but fencing is not
> implemented yet.

After talking it over with a few people, I have some confidence that the 
'incremental' fencing scheme I suggested a few weeks back will work... 
certainly with btrfs, and almost certainly with XFS.  Basically, this 
amounts to:

 - identify old rbd lock holder (rbd lock list <img>)
 - blacklist old owner (ceph osd blacklist add <addr>)
 - break old rbd lock (rbd lock remove <img> <lockid> <addr>)
 - lock rbd image on new host (rbd lock add <img> <lockid>)
 - map rbd image on new host
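The takeover sequence above might be scripted roughly as follows. This is a sketch, not a tested implementation: the image and lock names are placeholders, and the parsing of `rbd lock list` output is a guess that should be checked against the actual wip-librbd-locking CLI.

```shell
#!/bin/bash
# Sketch of the incremental fencing sequence described above.
# IMG and LOCKID are placeholders; the 'rbd lock list' output
# parsing is an assumption about the wip-librbd-locking CLI.
fence_and_takeover() {
    local img="$1" lockid="$2"

    # 1. Identify the old lock holder; assume the locker's address
    #    appears as the third column after the header line.
    local addr
    addr=$(rbd lock list "$img" | awk 'NR > 1 { print $3; exit }')

    # 2. Blacklist the old owner so the OSDs reject its requests.
    ceph osd blacklist add "$addr"

    # 3. Break the stale lock and take it ourselves.
    rbd lock remove "$img" "$lockid" "$addr"
    rbd lock add "$img" "$lockid"

    # 4. Map the image on the new host.
    rbd map "$img"
}
```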

The trick is making sure that the new image owner knows about the osdmap 
that includes the new blacklist... this ensures that any rbd objects that 
client reads are no longer touchable by the old guy.  This is probably 
always the case if the mounting/mapping kernel is interacting with the 
cluster for the first time, but I'd like to have a positive confirmation 
of that, and/or a general way to make sure a client has at least a given 
osdmap even in the case where it already has a cluster session (because 
another image is mapped).
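From the outside, one crude way to approximate that confirmation is to record the osdmap epoch right after blacklisting and then wait until the cluster reports at least that epoch. Note this only checks the monitor's view, not the mapping client's, so it does not fully close the gap described above; the `ceph osd dump` parsing is also an assumption.

```shell
#!/bin/bash
# Hypothetical epoch check. Assumes 'ceph osd dump' prints an
# "epoch N" line; this verifies only the monitor-side epoch, not
# that the kernel client has actually received that map.
osdmap_epoch() {
    ceph osd dump | awk '/^epoch/ { print $2; exit }'
}

wait_for_epoch() {
    local want="$1"
    # Poll until the reported epoch reaches the one that includes
    # the blacklist entry.
    while [ "$(osdmap_epoch)" -lt "$want" ]; do
        sleep 1
    done
}
```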

Anyway, enough pieces are already in place that you should be able to play 
with this.  The main piece of work we need for this scheme is a test that 
continually yanks a filesystem away from a VM/host, remounts, and 
continues, while verifying things don't go sour.
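A skeleton of such a test might look like the following; `takeover` and `check_fs` here are hypothetical stand-ins (stubbed as echoes) for the real fencing sequence and a filesystem consistency pass, which would presumably run over ssh against each host.

```shell
#!/bin/bash
# Skeleton of the suggested stress test: repeatedly yank the image
# away from one host, bring it up on the other, and verify nothing
# went sour.  The two helpers below are placeholders.
takeover() { echo "takeover: $2 -> $1"; }   # stand-in for the fencing sequence
check_fs() { echo "check_fs: $1 $2"; }      # stand-in for fsck/consistency check

yank_loop() {
    local img="$1"
    local hosts=("$2" "$3")
    local iters="${4:-100}"
    local cur=0 i=0
    while [ "$i" -lt "$iters" ]; do
        local next=$(( (cur + 1) % 2 ))
        # Forcibly move the image to the other host, then verify.
        takeover "${hosts[$next]}" "$img"
        check_fs "${hosts[$next]}" "$img"
        cur=$next
        i=$((i + 1))
    done
}
```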

Moving forward, we also need to make the rbd clients take/release locks 
themselves (instead of the user doing so explicitly), and make the 
'takeover' process a bit more streamlined.

sage