Created a ticket to improve our testing here -- this appears to be a hole.

http://tracker.ceph.com/issues/12742

-Sam

On Thu, Aug 20, 2015 at 4:09 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> So you started draining the cache pool before you saw either the
> inconsistent pgs or the anomalous snap behavior? (That is, writeback
> mode was working correctly?)
> -Sam
>
> On Thu, Aug 20, 2015 at 4:07 PM, Voloshanenko Igor
> <igor.voloshanenko@xxxxxxxxx> wrote:
>> Good joke )))))))))
>>
>> 2015-08-21 2:06 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>>>
>>> Certainly, don't reproduce this with a cluster you care about :).
>>> -Sam
>>>
>>> On Thu, Aug 20, 2015 at 4:02 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>> > What's supposed to happen is that the client transparently directs
>>> > all requests to the cache pool rather than the cold pool when there
>>> > is a cache pool. If the kernel is sending requests to the cold
>>> > pool, that's probably where the bug is. Odd. It could also be a bug
>>> > specific to 'forward' mode, either in the client or on the OSD. Why
>>> > did you have it in that mode?
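>>> >
>>> > (As a sanity check -- 'hot-storage' below is just a placeholder for
>>> > your cache pool name, and I'm going from memory on the dump output --
>>> > the current tier mode shows up in the osdmap, and switching back to
>>> > writeback is one command:
>>> >
>>> >   # show each pool's tier settings, including cache_mode
>>> >   ceph osd dump | grep cache_mode
>>> >   # put the cache tier back into writeback mode
>>> >   ceph osd tier cache-mode hot-storage writeback
>>> > )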
>>> > -Sam
>>> >
>>> > On Thu, Aug 20, 2015 at 3:58 PM, Voloshanenko Igor
>>> > <igor.voloshanenko@xxxxxxxxx> wrote:
>>> >> We use the 4.x branch, as we have "very good" Samsung 850 Pro
>>> >> drives in production, and they don't support NCQ TRIM...
>>> >>
>>> >> And 4.x is the first branch that includes an exception for them in
>>> >> libata (libata-core.c).
>>> >>
>>> >> Sure, we could backport this one line to the 3.x branch, but we
>>> >> prefer not to go deeper if a package for a newer kernel exists.
>>> >>
>>> >> 2015-08-21 1:56 GMT+03:00 Voloshanenko Igor
>>> >> <igor.voloshanenko@xxxxxxxxx>:
>>> >>>
>>> >>> root@test:~# uname -a
>>> >>> Linux ix-s5 4.0.4-040004-generic #201505171336 SMP Sun May 17
>>> >>> 17:37:22 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>>> >>>
>>> >>> 2015-08-21 1:54 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>>> >>>>
>>> >>>> Also, can you include the kernel version?
>>> >>>> -Sam
>>> >>>>
>>> >>>> On Thu, Aug 20, 2015 at 3:51 PM, Samuel Just <sjust@xxxxxxxxxx>
>>> >>>> wrote:
>>> >>>> > Snapshotting with cache/tiering *is* supposed to work. Can you
>>> >>>> > open a bug?
>>> >>>> > -Sam
>>> >>>> >
>>> >>>> > On Thu, Aug 20, 2015 at 3:36 PM, Andrija Panic
>>> >>>> > <andrija.panic@xxxxxxxxx> wrote:
>>> >>>> >> This was related to the caching layer, which doesn't support
>>> >>>> >> snapshotting per the docs... for the sake of closing the
>>> >>>> >> thread.
>>> >>>> >>
>>> >>>> >> On 17 August 2015 at 21:15, Voloshanenko Igor
>>> >>>> >> <igor.voloshanenko@xxxxxxxxx> wrote:
>>> >>>> >>>
>>> >>>> >>> Hi all, can you please help me with an unexplained
>>> >>>> >>> situation...
>>> >>>> >>>
>>> >>>> >>> All snapshots inside Ceph are broken...
>>> >>>> >>>
>>> >>>> >>> So, as an example, we have a VM template stored as an RBD
>>> >>>> >>> image inside Ceph.
>>> >>>> >>> We can map and mount it to check that all is OK with it:
>>> >>>> >>>
>>> >>>> >>> root@test:~# rbd map
>>> >>>> >>> cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5
>>> >>>> >>> /dev/rbd0
>>> >>>> >>> root@test:~# parted /dev/rbd0 print
>>> >>>> >>> Model: Unknown (unknown)
>>> >>>> >>> Disk /dev/rbd0: 10.7GB
>>> >>>> >>> Sector size (logical/physical): 512B/512B
>>> >>>> >>> Partition Table: msdos
>>> >>>> >>>
>>> >>>> >>> Number  Start   End     Size    Type     File system  Flags
>>> >>>> >>>  1      1049kB  525MB   524MB   primary  ext4         boot
>>> >>>> >>>  2      525MB   10.7GB  10.2GB  primary               lvm
>>> >>>> >>>
>>> >>>> >>> Then I want to create a snap, so I do:
>>> >>>> >>> root@test:~# rbd snap create
>>> >>>> >>> cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap
>>> >>>> >>>
>>> >>>> >>> And now I want to map it:
>>> >>>> >>>
>>> >>>> >>> root@test:~# rbd map
>>> >>>> >>> cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap
>>> >>>> >>> /dev/rbd1
>>> >>>> >>> root@test:~# parted /dev/rbd1 print
>>> >>>> >>> Warning: Unable to open /dev/rbd1 read-write (Read-only file
>>> >>>> >>> system). /dev/rbd1 has been opened read-only.
>>> >>>> >>> Warning: Unable to open /dev/rbd1 read-write (Read-only file
>>> >>>> >>> system). /dev/rbd1 has been opened read-only.
>>> >>>> >>> Error: /dev/rbd1: unrecognised disk label
>>> >>>> >>>
>>> >>>> >>> Even the md5s are different...
>>> >>>> >>> root@ix-s2:~# md5sum /dev/rbd0
>>> >>>> >>> 9a47797a07fee3a3d71316e22891d752  /dev/rbd0
>>> >>>> >>> root@ix-s2:~# md5sum /dev/rbd1
>>> >>>> >>> e450f50b9ffa0073fae940ee858a43ce  /dev/rbd1
>>> >>>> >>>
>>> >>>> >>> OK, now I protect the snap and create a clone... but same
>>> >>>> >>> thing... the md5 for the clone is the same as for the snap:
>>> >>>> >>>
>>> >>>> >>> root@test:~# rbd unmap /dev/rbd1
>>> >>>> >>> root@test:~# rbd snap protect
>>> >>>> >>> cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap
>>> >>>> >>> root@test:~# rbd clone
>>> >>>> >>> cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap
>>> >>>> >>> cold-storage/test-image
>>> >>>> >>> root@test:~# rbd map cold-storage/test-image
>>> >>>> >>> /dev/rbd1
>>> >>>> >>> root@test:~# md5sum /dev/rbd1
>>> >>>> >>> e450f50b9ffa0073fae940ee858a43ce  /dev/rbd1
>>> >>>> >>>
>>> >>>> >>> ... but it's broken...
>>> >>>> >>> root@test:~# parted /dev/rbd1 print
>>> >>>> >>> Error: /dev/rbd1: unrecognised disk label
>>> >>>> >>>
>>> >>>> >>> =========
>>> >>>> >>>
>>> >>>> >>> Tech details:
>>> >>>> >>>
>>> >>>> >>> root@test:~# ceph -v
>>> >>>> >>> ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
>>> >>>> >>>
>>> >>>> >>> We have 2 inconsistent pgs, but none of the images are placed
>>> >>>> >>> on these pgs...
>>> >>>> >>>
>>> >>>> >>> root@test:~# ceph health detail
>>> >>>> >>> HEALTH_ERR 2 pgs inconsistent; 18 scrub errors
>>> >>>> >>> pg 2.490 is active+clean+inconsistent, acting [56,15,29]
>>> >>>> >>> pg 2.c4 is active+clean+inconsistent, acting [56,10,42]
>>> >>>> >>> 18 scrub errors
>>> >>>> >>>
>>> >>>> >>> ============
>>> >>>> >>>
>>> >>>> >>> root@test:~# ceph osd map cold-storage
>>> >>>> >>> 0e23c701-401d-4465-b9b4-c02939d57bb5
>>> >>>> >>> osdmap e16770 pool 'cold-storage' (2) object
>>> >>>> >>> '0e23c701-401d-4465-b9b4-c02939d57bb5' -> pg 2.74458f70
>>> >>>> >>> (2.770) -> up ([37,15,14], p37) acting ([37,15,14], p37)
>>> >>>> >>> root@test:~# ceph osd map cold-storage
>>> >>>> >>> 0e23c701-401d-4465-b9b4-c02939d57bb5@snap
>>> >>>> >>> osdmap e16770 pool 'cold-storage' (2) object
>>> >>>> >>> '0e23c701-401d-4465-b9b4-c02939d57bb5@snap' -> pg 2.793cd4a3
>>> >>>> >>> (2.4a3) -> up ([12,23,17], p12) acting ([12,23,17], p12)
>>> >>>> >>> root@test:~# ceph osd map cold-storage
>>> >>>> >>> 0e23c701-401d-4465-b9b4-c02939d57bb5@test-image
>>> >>>> >>> osdmap e16770 pool 'cold-storage' (2) object
>>> >>>> >>> '0e23c701-401d-4465-b9b4-c02939d57bb5@test-image' -> pg
>>> >>>> >>> 2.9519c2a9 (2.2a9) -> up ([12,44,23], p12) acting
>>> >>>> >>> ([12,44,23], p12)
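>>> >>>> >>>
>>> >>>> >>> (Note: as I understand it, 'ceph osd map' just hashes the
>>> >>>> >>> literal object name you give it -- the image data actually
>>> >>>> >>> lives in objects named after the image's block_name_prefix.
>>> >>>> >>> So a fuller check -- the prefix below is illustrative, and
>>> >>>> >>> listing a big pool is slow -- would be something like:
>>> >>>> >>>
>>> >>>> >>> rbd info cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5 \
>>> >>>> >>>     | grep block_name_prefix
>>> >>>> >>> # suppose it prints: block_name_prefix: rb.0.123456.238e1f29
>>> >>>> >>> rados -p cold-storage ls | grep '^rb\.0\.123456' \
>>> >>>> >>>     | while read obj; do ceph osd map cold-storage "$obj"; done \
>>> >>>> >>>     | grep -E '\(2\.490\)|\(2\.c4\)'
>>> >>>> >>> )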
>>> >>>> >>>
>>> >>>> >>> Also, we use a cache layer, which at the current moment is in
>>> >>>> >>> forward mode...
>>> >>>> >>>
>>> >>>> >>> Can you please help me with this? My brain has stopped
>>> >>>> >>> understanding what is going on...
>>> >>>> >>>
>>> >>>> >>> Thanks in advance!
>>> >>>> >>>
>>> >>>> >>
>>> >>>> >> --
>>> >>>> >>
>>> >>>> >> Andrija Panić
>>> >>>> >>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com