Re: Unable to deactivate lv, perhaps due to semaphore problem...


 



On Thu, Nov 27, 2014 at 3:33 PM, Zdenek Kabelac <zkabelac@xxxxxxxxxx> wrote:
Dne 27.11.2014 v 15:26 Gianluca Cecchi napsal(a):
Hello,
I'm unable to deactivate a logical volume (LV).

My system is RHEL 6.5 with lvm2-2.02.100-8.el6.x86_64 and kernel
2.6.32-431.29.2.el6.x86_64

I get error code 5 with message
   Logical volume VG_AAA_TEMP/LV_AAA_TEMP in use.

You can find output of
lvchange -d -d -d -d -d -d -an VG_AAA_TEMP/LV_AAA_TEMP
here:
https://drive.google.com/file/d/0BwoPbcrMv8mvTjlBMkRUbG9nczA/view?usp=sharing


Not really accessible.

strange, do you mean the google docs link?
I tried with a browser without access to any gmail account and I'm able to download it....
   

But anyway, if you have a problem with 'semaphore' resources, you could 'recycle' old ones:

'dmsetup  udevcomplete_all'

This is actually a production server with many other LVs... Is there any drawback in the command above?
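
A hedged aside on the question above: device-mapper udev cookies are backed by System V semaphores, so one way to see what would be recycled is to inspect the semaphore list first. The sketch below assumes that cookie semaphores carry keys starting with 0x0d4d (the DM_COOKIE_MAGIC prefix in libdevmapper) and that `ipcs -s` prints the key in its first column and the semaphore id in its second. `dm_cookie_semaphores` is a made-up helper name, not an LVM tool.

```shell
# Filter `ipcs -s` style output down to semaphores that look like
# device-mapper udev cookies (assumption: key prefix 0x0d4d).
# Prints "key semid" for each match.
dm_cookie_semaphores() {
    awk 'tolower($1) ~ /^0x0d4d/ { print $1, $2 }'
}

# Typical use on a live system (read-only, changes nothing):
#   ipcs -s | dm_cookie_semaphores
#   dmsetup udevcookies
```

`dmsetup udevcookies` lists the same cookies from the device-mapper side. Comparing the two before running `dmsetup udevcomplete_all` shows how many cookies are outstanding. Completing them wakes any process still waiting on a cookie, so it is worth checking that no LVM or dmsetup command is mid-flight at that moment.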
 

Of course, it's hard to guess what experiments you are doing and what could lead to uncompleted cookies (stuck udev scans).

Actually no experiment at all.
The node is part of a rhel 2-nodes production cluster with HA_LVM based services.
We need to relocate many services to the other node for planned maintenance, but it seems that this node can stop the lvm resources but cannot cleanly deactivate the LVs. We get messages like:

Nov 26 17:35:29 orapr2 rgmanager[5765]: [lvm] Deactivating VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:29 orapr2 rgmanager[5786]: [lvm] Making resilient : lvchange -an VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:29 orapr2 rgmanager[5809]: [lvm] Resilient command: lvchange -an VG_AAA_TEMP/LV_AAA_TEMP --config devices{filter=["a|/dev/mapper/360a9800037543544465d424
Nov 26 17:35:34 orapr2 rgmanager[5883]: [lvm] lv_exec_resilient failed
Nov 26 17:35:34 orapr2 rgmanager[5908]: [lvm] lv_activate_resilient stop failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5928]: [lvm] Unable to deactivate VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5948]: [lvm] Failed to stop VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5968]: [lvm] Attempting cleanup of VG_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[5989]: [lvm] VG_AAA_TEMP now consistent
Nov 26 17:35:34 orapr2 rgmanager[6013]: [lvm] Deactivating VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:34 orapr2 rgmanager[6033]: [lvm] Making resilient : lvchange -an VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:35 orapr2 rgmanager[6056]: [lvm] Resilient command: lvchange -an VG_AAA_TEMP/LV_AAA_TEMP --config devices{filter=["a|/dev/mapper/360a9800037543544465d424
Nov 26 17:35:39 orapr2 rgmanager[6648]: [lvm] lv_exec_resilient failed
Nov 26 17:35:40 orapr2 rgmanager[6670]: [lvm] lv_activate_resilient stop failed on VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[6690]: [lvm] Unable to deactivate VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[6710]: [lvm] Failed second attempt to stop VG_AAA_TEMP/LV_AAA_TEMP
Nov 26 17:35:40 orapr2 rgmanager[20260]: stop on lvm "LV_AAA_TEMP" returned 1 (generic error)
Nov 26 17:35:40 orapr2 rgmanager[20260]: Marking service:AAA as 'disabled', but some resources may still be allocated!
Nov 26 17:35:40 orapr2 rgmanager[20260]: Service service:AAA is disabled

And of course the other node is then unable to activate the service, because the LV is still held open by the first one:

Nov 26 17:35:40 orapr1 rgmanager[18596]: Starting disabled service service:AAA
Nov 26 17:35:41 orapr1 rgmanager[31420]: [lvm] Someone else owns this logical volume
Nov 26 17:35:41 orapr1 rgmanager[18596]: start on lvm "LV_AAA_TEMP" returned 1 (generic error)
Nov 26 17:35:41 orapr1 rgmanager[18596]: #68: Failed to start service:AAA; return value: 1

So I'm trying to reproduce the cluster command to see how to clean up the situation, using this particular service (named AAA), which is not as critical as the other ones running on the node.


Do you happen to have some suspended devices in your table?
(dmsetup info -c should show them)

It seems not. Only (L)ive states...

[root@orapr2 ~]# dmsetup info -c | awk '{print $4}' | sort | uniq -c
     77 L--w
      1 Stat
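
The `1 Stat` line above is the column header being counted along with the devices. A small sketch that avoids this and looks for suspended devices directly, assuming the attr string carries the suspended flag (`s`) in its third position, as in `L-sw`. `suspended_devices` is a hypothetical helper, and the reporting options in the usage comment should be checked against the local dmsetup man page.

```shell
# Read "name:attr" lines on stdin and print the names of devices whose
# attr string has 's' (suspended) as its third character.
suspended_devices() {
    awk -F: 'substr($2, 3, 1) == "s" { print $1 }'
}

# Typical use (assumes dmsetup supports these reporting options):
#   dmsetup info -c --noheadings --separator : -o name,attr | suspended_devices
```

With 77 devices all reporting `L--w`, this would print nothing, which matches the result above.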

 


[root@orapr2 ~]# lvs VG_AAA_TEMP/LV_AAA_TEMP
   LV          VG          Attr       LSize    Pool Origin Data%  Move Log Cpy%Sync Convert
   LV_AAA_TEMP VG_AAA_TEMP -wi-ao---- 1020.00m

How can I find out what holds the reference that apparently keeps it open

Open count:        1
so that I can check and possibly fix it?


dmsetup ls --tree

is usually good at showing deps between devs (i.e. target A holds target B)

Regards

Zdenek



It returns no particularly relevant output:
...
 VG_AAA_TEMP-LV_AAA_TEMP (253:49)
 └─360a9800037543544465d424130533177 (253:4)
    ├─ (130:128)
    ├─ (129:32)
    ├─ (68:48)
    ├─ (8:96)
    ├─ (8:288)
    ├─ (133:224)
    ├─ (69:192)
    └─ (66:160)
...
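
In the excerpt above nothing is stacked on top of VG_AAA_TEMP-LV_AAA_TEMP (253:49), so the open reference is probably not another device-mapper device. A rough sketch of a cross-check through sysfs, assuming the kernel exposes /sys/dev/block/<major>:<minor>/holders. `list_holders` and its optional sysfs-root argument are illustrative only.

```shell
# Print the kernel-registered holders of a block device, one per line.
# $1 = major:minor (e.g. 253:49), $2 = sysfs root (defaults to /sys).
list_holders() {
    dir="${2:-/sys}/dev/block/$1/holders"
    [ -d "$dir" ] || { echo "no such device: $1" >&2; return 1; }
    ls -1 "$dir"
}

# Typical use:
#   list_holders 253:49
#   dmsetup info -c -o name,open VG_AAA_TEMP-LV_AAA_TEMP
```

An empty holders list together with an open count of 1 would point to a mounted filesystem or a process with the device open. `fuser -vm /dev/VG_AAA_TEMP/LV_AAA_TEMP` or `lsof` on the device node may help from there, with the caveat that in-kernel users do not show up in either.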

BTW: I'm testing with this one, but the problem seems to be general, in the sense that every LV shows this kind of behaviour when I try to deactivate it...

Thanks in advance for any other insight, and let me know if I should send the debug log of the lvchange command in case you are still not able to access it...

Gianluca
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
