On Tue, Nov 17, 2020 at 11:52:28AM +0800, heming.zhao@xxxxxxxx wrote:
> In LVM functions, "return 0" is treated as the error case.
>
> If _lvchange_activate() returns ECMD_FAILED, the caller
> _lvchange_activate_single() treats it as success:
> ```
> if (!_lvchange_activate(cmd, lv))  <== ECMD_FAILED is 5, won't enter the if case.
>         return_ECMD_FAILED;
> ```

Thanks for finding that. In some places 0 is the error value and in
other places ECMD_FAILED is the error value; they frequently get mixed
up (a minimal sketch of the mixup follows at the end of this message).
I believe this is the bug you are seeing:

https://sourceware.org/git/?p=lvm2.git;a=commit;h=aba9652e584b6f6a422233dea951eb59326a3de2

> 2. node2 changes the system ID to itself:
>
> ```
> [tb-clustermd2 ~]# vgchange -y --config "local/extra_system_ids='tb-clustermd1'" --systemid tb-clustermd2 vg1
> Volume group "vg1" successfully changed
> [tb-clustermd2 ~]# lvchange -ay vg1/lv1
> [tb-clustermd2 ~]# dmsetup ls
> vg1-lv1 (254:0)
> ```

This is what the LVM-activate resource agent does, except that it
wouldn't be done while the LV is active on another running host. Just
wanted to clarify that; I don't think it's the point of your
illustration here.

> 3. This time both sides have the dm device:
> ```
> [tb-clustermd1 ~]# dmsetup ls
> vg1-lv1 (254:0)
> [tb-clustermd2 ~]# dmsetup ls
> vg1-lv1 (254:0)
> ```

For the sake of anyone looking at this later: this shouldn't happen in
a properly running cluster. (If you wanted the LV active on two hosts
at once, you'd use lvmlockd and no system ID on the VG.)

> 4. node1 executes lvchange commands. Please note the return value is 0:
> ```
> [tb-clustermd1 ~]# lvchange -ay vg1/lv1 ; echo $?
> WARNING: Found LVs active in VG vg1 with foreign system ID tb-clustermd2. Possible data corruption.
> Cannot activate LVs in a foreign VG.
> 0
> ```

That's the one fixed by the commit above.

Dave
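
P.S. Since the two return conventions are easy to trip over, here is a
minimal, self-contained sketch of the mixup for the archives. The
activate() function is a hypothetical stand-in for an ECMD_*-style
command function, not actual lvm2 code; the values match the ones
quoted above (ECMD_FAILED is 5, and ECMD_PROCESSED, the success value,
is 1).

```
#include <stdio.h>

/* Success/failure codes following lvm2's ECMD_* convention. */
#define ECMD_PROCESSED 1
#define ECMD_FAILED    5

/* Hypothetical stand-in for _lvchange_activate(): it uses the
 * ECMD_* convention, so failure is ECMD_FAILED (5), not 0. */
static int activate(void)
{
	return ECMD_FAILED;	/* simulate the activation failing */
}

int main(void)
{
	/* The buggy pattern: this test assumes the other convention,
	 * where 0 means error.  !ECMD_FAILED is !5, which is 0, so
	 * the error branch is never taken and the failure looks like
	 * success (which is how lvchange could fail yet exit 0, as
	 * in step 4 above). */
	if (!activate())
		printf("buggy check: error caught\n");	/* never printed */

	/* The correct test for an ECMD_*-style return value. */
	if (activate() != ECMD_PROCESSED)
		printf("correct check: error caught\n");

	return 0;
}
```

Compiled and run, only the second check reports the failure; the first
silently falls through.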