Re: [urgent] KVM issues after upgrade to 0.94.4

There is an edge case in writeback caching for cloned images. It is triggered by an attempt to read a non-existent clone RADOS object, followed by a write to that object, followed by another read. The second read causes the cached write to be flushed to the OSD while the appropriate locks are not held. This issue is being tracked in an upstream tracker ticket [1].

This issue affects librbd clients using v0.94.4 and v9.x.  Disabling the cache or switching to write-through caching (rbd_cache_max_dirty = 0) should avoid the issue until it is fixed in the next Ceph release.
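
As a rough sketch of the two workarounds, the fragment below shows how they might look in ceph.conf on the hypervisor hosts. It assumes the librbd clients read the [client] section of the local ceph.conf (the file location and section are assumptions about your setup, not something stated above), and that the image is re-opened, for example by restarting the VM, so the new settings are picked up. Use one option or the other, not both.

```ini
[client]
# Option A: disable the librbd cache entirely.
rbd cache = false

# Option B: keep the cache but make it write-through, so no
# dirty data is held in the cache. Uncomment instead of Option A.
# rbd cache = true
# rbd cache max dirty = 0
```

Either setting is meant as a temporary measure and could be reverted once a release containing the fix for [1] is installed.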

[1] http://tracker.ceph.com/issues/13559

-- 

Jason Dillaman 


----- Original Message ----- 

> From: "Andrei Mikhailovsky" <andrei@xxxxxxxxxx>
> To: ceph-users@xxxxxxxx
> Sent: Wednesday, October 21, 2015 8:17:39 AM
> Subject:  [urgent] KVM issues after upgrade to 0.94.4

> Hello guys,

> I've upgraded to the latest Hammer release and I've just noticed a massive
> issue after the upgrade (((

> I am using ceph for virtual machine rbd storage over cloudstack. I am having
> issues with starting virtual routers. The libvirt error message is:

> cat r-1407-VM.log
> 2015-10-21 11:04:59.262+0000: starting up
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
> QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name r-1407-VM -S -machine
> pc-i440fx-trusty,accel=kvm,usb=off -m 256 -realtime mlock=off -smp
> 1,sockets=1,cores=1,threads=1 -uuid 815d2860-cc7f-475d-bf63-02814c720fe4
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/r-1407-VM.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
> -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive
> file=rbd:Primary-ubuntu-1/c3f90fb4-c1a6-4e99-a2c0-64ae4517412e:id=admin:key=AQDiDbJR2GqPABAAWCcsUQ+UQwK8z9c6LWrizw==:auth_supported=cephx\;none:mon_host=ceph-mon.csprdc.arhont.com\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=none
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
> -drive
> file=/usr/share/cloudstack-common/vms/systemvm.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none
> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1
> -netdev tap,fd=54,id=hostnet0,vhost=on,vhostfd=55 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:2e:f7:00:18,bus=pci.0,addr=0x3,rombar=0,romfile=
> -netdev tap,fd=56,id=hostnet1,vhost=on,vhostfd=57 -device
> virtio-net-pci,netdev=hostnet1,id=net1,mac=0e:00:a9:fe:01:42,bus=pci.0,addr=0x4,rombar=0,romfile=
> -netdev tap,fd=58,id=hostnet2,vhost=on,vhostfd=59 -device
> virtio-net-pci,netdev=hostnet2,id=net2,mac=06:0c:b6:00:02:13,bus=pci.0,addr=0x5,rombar=0,romfile=
> -chardev pty,id=charserial0 -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/r-1407-VM.agent,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=r-1407-VM.vport
> -device usb-tablet,id=input0 -vnc 192.168.169.2:10,password -device
> cirrus-vga,id=video0,bus=pci.0,addr=0x2
> Domain id=42 is tainted: high-privileges
> libust[20136/20136]: Warning: HOME environment variable not set. Disabling
> LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
> char device redirected to /dev/pts/13 (label charserial0)
> librbd/LibrbdWriteback.cc: In function 'virtual ceph_tid_t
> librbd::LibrbdWriteback::write(const object_t&, const object_locator_t&,
> uint64_t, uint64_t, const SnapContext&, const bufferlist&, utime_t,
> uint64_t, __u32, Context*)' thread 7ffa6b7fe700 time 2015-10-21
> 12:05:07.901876
> librbd/LibrbdWriteback.cc: 160: FAILED assert(m_ictx->owner_lock.is_locked())
> ceph version 0.94.4 (95292699291242794510b39ffde3f4df67898d3a)
> 1: (()+0x17258b) [0x7ffa92ef758b]
> 2: (()+0xa9573) [0x7ffa92e2e573]
> 3: (()+0x3a90ca) [0x7ffa9312e0ca]
> 4: (()+0x3b583d) [0x7ffa9313a83d]
> 5: (()+0x7212c) [0x7ffa92df712c]
> 6: (()+0x9590f) [0x7ffa92e1a90f]
> 7: (()+0x969a3) [0x7ffa92e1b9a3]
> 8: (()+0x4782a) [0x7ffa92dcc82a]
> 9: (()+0x56599) [0x7ffa92ddb599]
> 10: (()+0x7284e) [0x7ffa92df784e]
> 11: (()+0x162b7e) [0x7ffa92ee7b7e]
> 12: (()+0x163c10) [0x7ffa92ee8c10]
> 13: (()+0x8182) [0x7ffa8ec49182]
> 14: (clone()+0x6d) [0x7ffa8e97647d]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
> terminate called after throwing an instance of 'ceph::FailedAssertion'
> 2015-10-21 11:05:08.091+0000: shutting down

> From what I can see, there seems to be an issue with locking
> (librbd/LibrbdWriteback.cc: 160: FAILED
> assert(m_ictx->owner_lock.is_locked())). However, the r-1407-VM virtual
> router is a new router and has not been created or run before. So, I don't
> see why there is an issue with locking.

> Could someone please help me determine the cause of the error and how to fix
> it? I did not see this on 0.94.1.

> Many thanks

> Andrei

> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com