Re: rbd kernel client fencing

Thank you so much.

The blacklist entries are stored in the osd map, which is supposed to stay tiny and clean.
So we do a similar cleanup after reboot.
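
For reference, a minimal sketch of what such a boot-time cleanup could look
like (the address 10.0.0.5 stands in for the host's own IP):

  # list current blacklist entries and drop every one belonging to this host
  ceph osd blacklist ls | awk '/^10\.0\.0\.5:/ {print $1}' | \
    while read addr; do
      ceph osd blacklist rm "$addr"
    done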

I’m quite interested in how the host commits suicide and reboots:
can you successfully unmount the filesystem and unmap the rbd block device
after the host has been blacklisted?

I wonder whether the I/O will hang and the umount process will get stuck in D state,
so that the host cannot shut down because it is waiting for the umount to finish.
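
A quick way to check for that on the old host is nothing Ceph-specific, just
looking for tasks stuck in uninterruptible sleep:

  # show the header plus any process currently in D state and what it waits on
  ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /^D/'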

==============================

And now that the CentOS 7.3 kernel supports the exclusive-lock feature,
could anyone describe the new failover flow?

Thanks.


> On 20 Apr 2017, at 6:31 AM, Kjetil Jørgensen <kjetil@xxxxxxxxxxxx> wrote:
> 
> Hi,
> 
> As long as you blacklist the old owner by ip, you should be fine. Do
> note that rbd lock remove implicitly also blacklists unless you also
> pass rbd lock remove the --rbd_blacklist_on_break_lock=false option.
> (that is I think "ceph osd blacklist add a.b.c.d interval" translates
> into blacklisting a.b.c.d:0/0 - which should block every client with
> source ip a.b.c.d).
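> 
> Concretely, that would look something like this (the address and interval
> are placeholders):
> 
>   # blacklist 10.0.0.5:0/0 - every client with source ip 10.0.0.5 - for an hour
>   ceph osd blacklist add 10.0.0.5 3600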
> 
> Regardless, I believe the client taking out the lock (rbd cli) and the
> kernel client mapping the rbd will be different (port, nonce), so even if
> it were possible to blacklist a specific client by (ip, port, nonce), it
> wouldn't do you much good here, since different clients deal with the
> locking and the actual IO/mapping (rbd cli and kernel).
> 
> We do a variation of what you are suggesting, although additionally we
> check for watches; if watched, we give up and complain rather than
> blacklist. If the previous lock was held by my ip we just silently
> reclaim. The hosts themselves run a process watching for blacklist
> entries, and if they see themselves blacklisted they commit suicide and
> reboot. On boot, the machine removes its blacklist entries and reclaims
> any locks it used to hold before starting the things that might map rbd
> images. There are some warts in there, but for the most part it works
> well.
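> 
> A minimal way to check for watchers, assuming the image lives in the rbd
> pool, is something like:
> 
>   # lists any clients currently watching the image header
>   rbd status rbd/myimage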
> 
> If you are going the fencing route - I would strongly advise you to also
> ensure your processes don't end up with the possibility of cascading
> blacklists; in addition to being highly disruptive, it causes osd(?)
> map churn. (We accidentally did this - and ended up almost running our
> monitors out of disk).
> 
> Cheers,
> KJ
> 
> On Wed, Apr 19, 2017 at 2:35 AM, Chaofan Yu <chaofanyu@xxxxxxxxxxx> wrote:
>> Hi list,
>> 
>>  I wonder if someone can help with rbd kernel client fencing (aimed at avoiding
>> simultaneous rbd mapping of the same image on different hosts).
>> 
>> I know the exclusive-lock rbd image feature was added later to avoid the manual
>> rbd lock CLIs, but I want to understand the previous blacklist-based solution.
>> 
>> The official workflow I’ve got is listed below (without the exclusive-lock
>> feature):
>> 
>> - identify old rbd lock holder (rbd lock list <img>)
>> - blacklist old owner (ceph osd blacklist add <addr>)
>> - break old rbd lock (rbd lock remove <img> <lockid> <addr>)
>> - lock rbd image on new host (rbd lock add <img> <lockid>)
>> - map rbd image on new host
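>> 
>> Roughly, as commands (the image name, lock id, locker and address below are
>> placeholders; the real values come from the rbd lock list output):
>> 
>>   IMG=rbd/myimage
>>   # 1. identify the current lock holder; note the locker, lock id and address
>>   rbd lock list "$IMG"
>>   # 2. fence the old owner by blacklisting the address printed above
>>   ceph osd blacklist add 10.0.0.1:0/123456789
>>   # 3. break the stale lock
>>   rbd lock remove "$IMG" mylockid client.4123
>>   # 4. take the lock on the new host and map the image there
>>   rbd lock add "$IMG" mylockid
>>   rbd map "$IMG"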
>> 
>> 
>> The blacklisted entry is identified by entity_addr_t (ip, port, nonce).
>> 
>> However, as far as I know, the ceph kernel client will reconnect its socket if
>> the connection fails, so I suspect it won’t work in this scenario:
>> 
>> 1. the old client's network goes down for a while
>> 2. the steps below are performed on the new host to achieve failover:
>> - identify old rbd lock holder (rbd lock list <img>)
>> 
>> - blacklist old owner (ceph osd blacklist add <addr>)
>> - break old rbd lock (rbd lock remove <img> <lockid> <addr>)
>> - lock rbd image on new host (rbd lock add <img> <lockid>)
>> - map rbd image on new host
>> 
>> 3. the old client's network comes back and it reconnects to the OSDs with a
>> newly created socket, i.e. a new (ip, port, nonce) tuple
>> 
>> as a result both the new and the old client can write to the same rbd image,
>> which might cause data corruption.
>> 
>> So does this mean that if the kernel client does not support the exclusive-lock
>> image feature, fencing is not possible?
>> 
> 
> 
> 
> -- 
> Kjetil Joergensen <kjetil@xxxxxxxxxxxx>
> SRE, Medallia Inc
> Phone: +1 (650) 739-6580

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



