Re: RBD I/O errors with QEMU [luminous upgrade/osd change]

Since you have already upgraded to Luminous, the fastest and probably
easiest way to fix this is to run "ceph auth caps client.libvirt mon
'profile rbd' osd 'profile rbd pool=one'" [1]. Luminous provides
simplified RBD caps via named profiles, which ensure all the required
permissions are enabled.

[1] http://docs.ceph.com/docs/master/rados/operations/user-management/#authorization-capabilities
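For reference, a quick way to tell whether a keyring still has the
old-style caps is to look for the "profile rbd" string. This is just a
rough illustrative sketch using the keyring you pasted (the grep check is
my own ad-hoc test, not an official tool):

```shell
# Contents copied from the pasted "ceph auth get client.libvirt" output
# (key omitted). The actual fix is:
#   ceph auth caps client.libvirt mon 'profile rbd' osd 'profile rbd pool=one'
# then re-check with:
#   ceph auth get client.libvirt
cat > /tmp/client.libvirt.keyring <<'EOF'
[client.libvirt]
        caps mgr = "allow r"
        caps mon = "allow r"
        caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=one"
EOF

# Old-style caps lack the mon permissions needed to blacklist a dead
# client that still holds the exclusive lock, hence EPERM on lock break.
if grep -q "profile rbd" /tmp/client.libvirt.keyring; then
    echo "caps use rbd profiles"
else
    echo "caps predate rbd profiles - blacklisting a dead lock owner will fail"
fi
```

After updating the caps, the exported keyring should show 'caps mon =
"profile rbd"' and 'caps osd = "profile rbd pool=one"' instead of the
lines above.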

On Mon, Sep 11, 2017 at 4:56 PM, Nico Schottelius
<nico.schottelius@xxxxxxxxxxx> wrote:
>
> Hey Jason,
>
> here it is:
>
> [22:42:12] server4:~# ceph auth get client.libvirt
> exported keyring for client.libvirt
> [client.libvirt]
>         key = ...
>         caps mgr = "allow r"
>         caps mon = "allow r"
>         caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=one"
> [22:52:57] server4:~#
>
> p.s.: I am also available for online chat on
> https://brandnewchat.ungleich.ch/ in case you need more information quickly.
>
> Jason Dillaman <jdillama@xxxxxxxxxx> writes:
>
>> I see the following which is most likely the issue:
>>
>> 2017-09-11 22:26:38.945776 7efd677fe700 -1
>> librbd::managed_lock::BreakRequest: 0x7efd58020e70 handle_blacklist:
>> failed to blacklist lock owner: (13) Permission denied
>> 2017-09-11 22:26:38.945795 7efd677fe700 10
>> librbd::managed_lock::BreakRequest: 0x7efd58020e70 finish: r=-13
>> 2017-09-11 22:26:38.945798 7efd677fe700 10
>> librbd::managed_lock::AcquireRequest: 0x7efd60017960
>> handle_break_lock: r=-13
>> 2017-09-11 22:26:38.945800 7efd677fe700 -1
>> librbd::managed_lock::AcquireRequest: 0x7efd60017960
>> handle_break_lock: failed to break lock : (13) Permission denied
>> 2017-09-11 22:26:38.945865 7efd677fe700 10 librbd::ManagedLock:
>> 0x7efd580267d0 handle_acquire_lock: r=-13
>> 2017-09-11 22:26:38.945873 7efd677fe700 -1 librbd::ManagedLock:
>> 0x7efd580267d0 handle_acquire_lock: failed to acquire exclusive
>> lock:(13) Permission denied
>> 2017-09-11 22:26:38.945883 7efd677fe700 10 librbd::ExclusiveLock:
>> 0x7efd580267d0 post_acquire_lock_handler: r=-13
>> 2017-09-11 22:26:38.945887 7efd677fe700 10 librbd::ImageState:
>> 0x55b55ace8dc0 handle_prepare_lock_complete
>> 2017-09-11 22:26:38.945892 7efd677fe700 10 librbd::ManagedLock:
>> 0x7efd580267d0 handle_post_acquire_lock: r=-13
>> 2017-09-11 22:26:38.945895 7efd677fe700  5 librbd::io::ImageRequestWQ:
>> 0x55b55ace9a20 handle_acquire_lock: r=-13, req=0x55b55add32a0
>> 2017-09-11 22:26:38.945901 7efd677fe700 -1 librbd::io::AioCompletion:
>> 0x55b55add46a0 fail: (13) Permission denied
>>
>> It looks like your "client.libvirt" user lacks the permission to
>> blacklist a dead client that had previously acquired the exclusive
>> lock and failed to release it.
>>
>> Can you provide the results from "ceph auth get client.libvirt"? I
>> suspect it only has 'caps mon = "allow r"'.
>>
>> On Mon, Sep 11, 2017 at 4:45 PM, Nico Schottelius
>> <nico.schottelius@xxxxxxxxxxx> wrote:
>>>
>>>
>>> Thanks a lot for the great ceph.conf pointer, Mykola!
>>>
>>> I found something interesting:
>>>
>>> 2017-09-11 22:26:23.418796 7efd7d479700 10 client.1039597.objecter ms_dispatch 0x55b55ab8f950 osd_op_reply(4 rbd_header.df7343d1b58ba [call] v0'0 uv0 ondisk = -8 ((8) Exec format error)) v8
>>> 2017-09-11 22:26:23.439501 7efd7dc7a700 10 client.1039597.objecter
>>> ms_dispatch 0x55b55ab8f950 osd_op_reply(14 rbd_header.2b0c02ae8944a
>>> [call] v0'0 uv0 ondisk = -8 ((8) Exec format error)) v8
>>>
>>> Not sure if those are what is causing the problem, but they are at
>>> least some kind of error.
>>>
>>> I have uploaded the log at
>>> http://www.nico.schottelius.org/ceph.client.libvirt.41670.log.bz2
>>>
>>> I wonder if anyone sees the real reason for the I/O errors in the log?
>>>
>>> Best,
>>>
>>> Nico
>>>
>>>> Mykola Golub <mgolub@xxxxxxxxxxxx> writes:
>>>>
>>>>> On Sun, Sep 10, 2017 at 03:56:21PM +0200, Nico Schottelius wrote:
>>>>>>
>>>>>> Just tried and there is not much more log in ceph -w (see below) neither
>>>>>> from the qemu process.
>>>>>>
>>>>>> [15:52:43] server4:~$  /usr/bin/qemu-system-x86_64 -name one-17031 -S
>>>>>> -machine pc-i440fx-2.1,accel=kvm,usb=off -m 8192 -realtime mlock=off
>>>>>> -smp 6,sockets=6,cores=1,threads=1 -uuid
>>>>>> 79845fca-9b26-4072-bcb3-7f5206c2a531 -no-user-config -nodefaults
>>>>>> -chardev
>>>>>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-17031.monitor,server,nowait
>>>>>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
>>>>>> -no-shutdown -boot strict=on -device
>>>>>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
>>>>>> file='rbd:one/one-29-17031-0:id=libvirt:key=DELETEME:auth_supported=cephx\;none:mon_host=server1\:6789\;server3\:6789\;server5\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=none' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/one//datastores/100/17031/disk.1,if=none,id=drive-ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -vnc [::]:21131 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on 2>&1 | tee kvmlogwithdebug
>>>>>>
>>>>>> -> no output
>>>>>
>>>>> Try to find where the qemu process writes the ceph log, e.g. with the
>>>>> help of the lsof utility. Or add something like the line below
>>>>>
>>>>>  log file = /tmp/ceph.$name.$pid.log
>>>>>
>>>>> to ceph.conf before starting qemu and look for /tmp/ceph.*.log
>>>
>>>
>>> --
>>> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
>
>
> --
> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch



-- 
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


