Re: [ovirt-users] open error -13 = sanlock


Hi guys,
first of all, thanks a lot for your support.

Because we were under huge time pressure, we used a workaround found via Google, which deletes both files. It helped, probably in the first steps of the recovery.
E.g.:
    # find /STORAGES/g1r5p5/GFS/ -samefile /STORAGES/g1r5p5/GFS/3da46e07-d1ea-4f10-9250-6cbbb7b94d80/dom_md/ids -print -delete

First, I'll fix the permissions to 660 via the mount points.
If the "ids" file becomes writable, couldn't that cause gluster to collapse?


On 2.3.2016 08:16, Ravishankar N wrote:
On 03/02/2016 12:02 PM, Sahina Bose wrote:

On 03/02/2016 03:45 AM, Nir Soffer wrote:
On Tue, Mar 1, 2016 at 10:51 PM, paf1@xxxxxxxx <paf1@xxxxxxxx> wrote:
> HI,
> requested output:
> # ls -lh /rhev/data-center/mnt/glusterSD/localhost:*/*/dom_md
> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-BCK/0fcad888-d573-47be-bef3-0bc0b7a99fb7/dom_md:
> total 2,1M
> -rw-rw---- 1 vdsm kvm 1,0M  1. bře 21.28 ids        <-- good
> -rw-rw---- 1 vdsm kvm  16M  7. lis 22.16 inbox
> -rw-rw---- 1 vdsm kvm 2,0M  7. lis 22.17 leases
> -rw-r--r-- 1 vdsm kvm  335  7. lis 22.17 metadata
> -rw-rw---- 1 vdsm kvm  16M  7. lis 22.16 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P1/553d9b92-e4a0-4042-a579-4cabeb55ded4/dom_md:
> total 1,1M
> -rw-r--r-- 1 vdsm kvm    0 24. úno 07.41 ids        <-- bad (sanlock cannot write, other can read)
> -rw-rw---- 1 vdsm kvm  16M  7. lis 00.14 inbox
> -rw-rw---- 1 vdsm kvm 2,0M  7. lis 03.56 leases
> -rw-r--r-- 1 vdsm kvm  333  7. lis 03.56 metadata
> -rw-rw---- 1 vdsm kvm  16M  7. lis 00.14 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P2/88adbd49-62d6-45b1-9992-b04464a04112/dom_md:
> total 1,1M
> -rw-r--r-- 1 vdsm kvm    0 24. úno 07.43 ids        <-- bad (sanlock cannot write, other can read)
> -rw-rw---- 1 vdsm kvm  16M  7. lis 00.15 inbox
> -rw-rw---- 1 vdsm kvm 2,0M  7. lis 22.14 leases
> -rw-r--r-- 1 vdsm kvm  333  7. lis 22.14 metadata
> -rw-rw---- 1 vdsm kvm  16M  7. lis 00.15 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P3/3c34ad63-6c66-4e23-ab46-084f3d70b147/dom_md:
> total 1,1M
> -rw-r--r-- 1 vdsm kvm    0 24. úno 07.43 ids        <-- bad (sanlock cannot write, other can read)
> -rw-rw---- 1 vdsm kvm  16M 23. úno 22.51 inbox
> -rw-rw---- 1 vdsm kvm 2,0M 23. úno 23.12 leases
> -rw-r--r-- 1 vdsm kvm  998 25. úno 00.35 metadata
> -rw-rw---- 1 vdsm kvm  16M  7. lis 00.16 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md:
> total 1,1M
> -rw-r--r-- 1 vdsm kvm    0 24. úno 07.44 ids        <-- bad (sanlock cannot write, other can read)
> -rw-rw---- 1 vdsm kvm  16M  7. lis 00.17 inbox
> -rw-rw---- 1 vdsm kvm 2,0M  7. lis 00.18 leases
> -rw-r--r-- 1 vdsm kvm  333  7. lis 00.18 metadata
> -rw-rw---- 1 vdsm kvm  16M  7. lis 00.17 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P1/42d710a9-b844-43dc-be41-77002d1cd553/dom_md:
> total 1,1M
> -rw-rw-r-- 1 vdsm kvm    0 24. úno 07.32 ids        <-- bad (other can read)
> -rw-rw---- 1 vdsm kvm  16M  7. lis 22.18 inbox
> -rw-rw---- 1 vdsm kvm 2,0M  7. lis 22.18 leases
> -rw-r--r-- 1 vdsm kvm  333  7. lis 22.18 metadata
> -rw-rw---- 1 vdsm kvm  16M  7. lis 22.18 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P2/ff71b47b-0f72-4528-9bfe-c3da888e47f0/dom_md:
> total 3,0M
> -rw-rw-r-- 1 vdsm kvm 1,0M  1. bře 21.28 ids        <-- bad (other can read)
> -rw-rw---- 1 vdsm kvm  16M 25. úno 00.42 inbox 
> -rw-rw---- 1 vdsm kvm 2,0M 25. úno 00.44 leases
> -rw-r--r-- 1 vdsm kvm  997 24. úno 02.46 metadata
> -rw-rw---- 1 vdsm kvm  16M 25. úno 00.44 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P3/ef010d08-aed1-41c4-ba9a-e6d9bdecb4b4/dom_md:
> total 2,1M
> -rw-r--r-- 1 vdsm kvm    0 24. úno 07.34 ids        <-- bad (sanlock cannot write, other can read)
> -rw-rw---- 1 vdsm kvm  16M 23. úno 22.35 inbox
> -rw-rw---- 1 vdsm kvm 2,0M 23. úno 22.38 leases
> -rw-r--r-- 1 vdsm kvm 1,1K 24. úno 19.07 metadata
> -rw-rw---- 1 vdsm kvm  16M 23. úno 22.27 outbox
> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12__P4/300e9ac8-3c2f-4703-9bb1-1df2130c7c97/dom_md:
> total 3,0M
> -rw-rw-r-- 1 vdsm kvm 1,0M  1. bře 21.28 ids        <-- bad (other can read)
> -rw-rw-r-- 1 vdsm kvm  16M  6. lis 23.50 inbox        <-- bad (other can read)
> -rw-rw-r-- 1 vdsm kvm 2,0M  6. lis 23.51 leases        <-- bad (other can read)
> -rw-rw-r-- 1 vdsm kvm  734  7. lis 02.13 metadata        <-- bad (group can write, other can read)
> -rw-rw-r-- 1 vdsm kvm  16M  6. lis 16.55 outbox        <-- bad (other can read)
> /rhev/data-center/mnt/glusterSD/localhost:_2KVM12-P5/1ca56b45-701e-4c22-9f59-3aebea4d8477/dom_md:
> total 1,1M
> -rw-rw-r-- 1 vdsm kvm    0 24. úno 07.35 ids        <-- bad (other can read)
> -rw-rw-r-- 1 vdsm kvm  16M 24. úno 01.06 inbox
> -rw-rw-r-- 1 vdsm kvm 2,0M 24. úno 02.44 leases
> -rw-r--r-- 1 vdsm kvm  998 24. úno 19.07 metadata
> -rw-rw-r-- 1 vdsm kvm  16M  7. lis 22.20 outbox

It should look like this:

-rw-rw----. 1 vdsm kvm 1.0M Mar  1 23:36 ids
-rw-rw----. 1 vdsm kvm 2.0M Mar  1 23:35 leases
-rw-r--r--. 1 vdsm kvm  353 Mar  1 23:35 metadata
-rw-rw----. 1 vdsm kvm  16M Mar  1 23:34 outbox
-rw-rw----. 1 vdsm kvm  16M Mar  1 23:34 inbox

This explains the EACCES error.

You can start by fixing the permissions manually; this can be done online.
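
To see which files deviate from the listing above before changing anything, a minimal sketch along these lines may help. The function names are made up for illustration, the expected modes are copied from the "It should look like this" listing, and `stat -c` assumes GNU coreutils. It only prints chmod commands and changes nothing on disk.

```shell
#!/bin/sh
# Expected modes, copied from the "It should look like this" listing.
expected_mode() {
    case "$1" in
        ids|leases|inbox|outbox) echo 660 ;;
        metadata)                echo 644 ;;
        *)                       echo "" ;;
    esac
}

# Print a chmod command for every known file in a dom_md directory whose
# mode differs from the expected one. Hypothetical helper, read-only.
report_bad_modes() {
    dom_md=$1
    for f in "$dom_md"/*; do
        [ -f "$f" ] || continue
        want=$(expected_mode "$(basename "$f")")
        [ -n "$want" ] || continue
        have=$(stat -c '%a' "$f")          # GNU stat: octal permission bits
        [ "$have" = "$want" ] || echo "chmod $want $f"
    done
}
```

Run it against a dom_md directory on the glusterfs mount (not on the bricks), review the printed commands, and only then execute them.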
> The ids files were generated with the "touch" command after they were deleted because of a "sanlock locking hang", gluster crash and reboot.
> I expected that they would be filled automatically after the gluster reboot (the shadow copy from the ".gluster" directory was deleted and created empty too).

I don't know about gluster shadow copy, I would not play with gluster internals.
Adding Sahina for advice.

Did you generate the ids file on the mount point?

Ravi, can you help here?

Okay, so what I understand from the output above is you have different gluster volumes mounted and some of them have incorrect permissions for the 'ids' file. The way to fix it is to do it from the mount like Nir said.
Why did you delete the file from the .glusterfs in the brick(s)?  Was there a gfid split brain?


> OK, it looks like sanlock can't work with empty files or rewrite them.
> Am I right?

Yes, the files must be initialized before sanlock can use them.

You can initialize the file like this:

sanlock direct init -s <sd_uuid>:0:repair/<sd_uuid>/dom_md/ids:0

Taken from
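
The argument to -s in the command above has the form lockspace_name:host_id:path:offset. As a rough sketch, it could be assembled per storage domain like this. The helper names are hypothetical, and the script only prints the command instead of running it. It does no escaping, so a path that itself contains ":" (as the glusterSD mount paths in this thread do) may need extra care; check sanlock(8) for your version.

```shell
#!/bin/sh
# Build the lockspace argument in the same form as the command above:
# <sd_uuid>:0:<path to ids>:0
lockspace_spec() {
    sd_uuid=$1
    ids_path=$2
    printf '%s:0:%s:0\n' "$sd_uuid" "$ids_path"
}

# Print (do not run) the init command for one storage domain.
# mount_dir is the directory where the volume is mounted, e.g. "repair".
print_init_command() {
    mount_dir=$1
    sd_uuid=$2
    echo "sanlock direct init -s $(lockspace_spec "$sd_uuid" "$mount_dir/$sd_uuid/dom_md/ids")"
}
```

For example, `print_init_command repair 7f52b697-c199-4f58-89aa-102d44327124` prints the command for the 1KVM12-P4 domain seen in the sanlock log, assuming that volume is mounted at ./repair.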

> The last point, about the "ids" workaround: this is the offline version, i.e. the VMs have to be moved off the volume to keep running while the volume is in maintenance mode.
> But this is not acceptable in the current situation, so the question again: is it safe to do it online? (YES / NO)

The ids file is accessed only by sanlock. I guess that you don't have a running
SPM on this DC, since sanlock fails to acquire a host id, so you are pretty safe
to fix the permissions and initialize the ids files.

I would do this:

1. Stop engine,  so it will not try to start vdsm
2. Stop vdsm on all hosts, so they do not try to acquire a host id with sanlock
    This does not affect running vms
3. Fix the permissions on the ids file, via glusterfs mount
4. Initialize the ids files from one of the hosts, via the glusterfs mount
    This should fix the ids files on all replicas
5. Start vdsm on all hosts
6. Start engine

Engine will connect to all hosts, hosts will connect to storage and try to acquire a host id.
Then Engine will start the SPM on one of the hosts, and your DC should become up.
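
The ordering of the six steps can be written down as a dry-run script, so it can be reviewed before anything is touched. This is only a sketch: the function name is made up, the systemd unit names "ovirt-engine" and "vdsmd" and access to the machines over ssh are assumptions, and the script prints the commands rather than executing them.

```shell
#!/bin/sh
# Print the recovery sequence as commands; nothing is executed.
# Assumed (not from this thread): unit names ovirt-engine and vdsmd, ssh access.
# Usage: print_recovery_plan <engine_host> <ids_path_on_mount> <host>...
print_recovery_plan() {
    engine=$1; shift
    ids_path=$1; shift
    echo "ssh $engine systemctl stop ovirt-engine"       # step 1
    for h in "$@"; do
        echo "ssh $h systemctl stop vdsmd"               # step 2
    done
    echo "chmod 660 $ids_path"                           # step 3, via the glusterfs mount
    echo "# run 'sanlock direct init' for $ids_path from one host"   # step 4
    for h in "$@"; do
        echo "ssh $h systemctl start vdsmd"              # step 5
    done
    echo "ssh $engine systemctl start ovirt-engine"      # step 6
}
```

Steps 3 and 4 have to be repeated for every storage domain whose ids file is empty or has the wrong mode.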

David, Sahina, can you confirm that this procedure is safe?

Yes, correcting from the mount point should fix it on all replicas


> regs.
> Pavel
> On 1.3.2016 18:38, Nir Soffer wrote:
> On Tue, Mar 1, 2016 at 5:07 PM, paf1@xxxxxxxx <paf1@xxxxxxxx> wrote:
>> Hello, can anybody explain this error no. 13 (open file) in sanlock.log?
> This is EACCES
> Can you share the output of:
>     ls -lh /rhev/data-center/mnt/<server>:<_path>/<sd_uuid>/dom_md
>> The size of  "ids" file is zero (0)
> This is how we create the ids file when initializing it.
> But then we use sanlock to initialize the ids file, and it should be 1MiB after that.
> Were these ids files created by vdsm, or did you create them yourself?
>> 2016-02-28 03:25:46+0100 269626 [1951]: open error -13 /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids
>> 2016-02-28 03:25:46+0100 269626 [1951]: s187985 open_disk /rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids error -13
>> 2016-02-28 03:25:56+0100 269636 [11304]: s187992 lockspace 7f52b697-c199-4f58-89aa-102d44327124:1:/rhev/data-center/mnt/glusterSD/localhost:_1KVM12-P4/7f52b697-c199-4f58-89aa-102d44327124/dom_md/ids:0
>> If the main problem is the zero file size, can I regenerate this file online safely, without affecting the VMs?
> Yes, I think I already referred to the instructions on how to do that in a previous mail.
>> dist = RHEL - 7 - 2.1511
>> kernel = 3.10.0 - 327.10.1.el7.x86_64
>> KVM = 2.3.0 - 29.1.el7
>> libvirt = libvirt-1.2.17-13.el7_2.3
>> vdsm = vdsm-4.16.30-0.el7
>> GlusterFS = glusterfs-3.7.8-1.el7
>> regs.
>> Pavel
>> _______________________________________________
>> Users mailing list
>> Users@xxxxxxxxx

Gluster-users mailing list
