Yeah, I'm trying to confirm that the issues did happen in writeback mode.
-Sam

On Thu, Aug 20, 2015 at 4:21 PM, Voloshanenko Igor
<igor.voloshanenko@xxxxxxxxx> wrote:
> Right. But issues started...
>
> 2015-08-21 2:20 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>>
>> But that was still in writeback mode, right?
>> -Sam
>>
>> On Thu, Aug 20, 2015 at 4:18 PM, Voloshanenko Igor
>> <igor.voloshanenko@xxxxxxxxx> wrote:
>> > We haven't set values for max_bytes / max_objects, so all data was
>> > initially written only to the cache layer and not flushed at all to
>> > the cold layer.
>> >
>> > Then we received a notification from monitoring that we had collected
>> > about 750GB in the hot pool, so I changed the values for
>> > max_object_bytes to be 0.9 of disk size... And then evicting/flushing
>> > started...
>> >
>> > And the issue with snapshots arrived.
>> >
>> > 2015-08-21 2:15 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >>
>> >> Not sure what you mean by:
>> >>
>> >> but it's stop to work in same moment, when cache layer fulfilled with
>> >> data and evict/flush started...
>> >> -Sam
>> >>
>> >> On Thu, Aug 20, 2015 at 4:11 PM, Voloshanenko Igor
>> >> <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> > No, when we started draining the cache, the bad pgs were already in
>> >> > place...
>> >> > We had a big rebalance (disk by disk - to change journal side on both
>> >> > hot/cold layers).. All was OK, but after 2 days scrub errors arrived
>> >> > and 2 pgs became inconsistent...
>> >> >
>> >> > In writeback - yes, looks like snapshot works good. but it's stop to
>> >> > work in same moment, when cache layer fulfilled with data and
>> >> > evict/flush started...
>> >> >
>> >> > 2015-08-21 2:09 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >> >>
>> >> >> So you started draining the cache pool before you saw either the
>> >> >> inconsistent pgs or the anomalous snap behavior?  (That is, writeback
>> >> >> mode was working correctly?)
>> >> >> -Sam
>> >> >>
>> >> >> On Thu, Aug 20, 2015 at 4:07 PM, Voloshanenko Igor
>> >> >> <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> >> > Good joke )))))))))
>> >> >> >
>> >> >> > 2015-08-21 2:06 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >> >> >>
>> >> >> >> Certainly, don't reproduce this with a cluster you care about :).
>> >> >> >> -Sam
>> >> >> >>
>> >> >> >> On Thu, Aug 20, 2015 at 4:02 PM, Samuel Just <sjust@xxxxxxxxxx>
>> >> >> >> wrote:
>> >> >> >> > What's supposed to happen is that the client transparently
>> >> >> >> > directs all requests to the cache pool rather than the cold pool
>> >> >> >> > when there is a cache pool.  If the kernel is sending requests to
>> >> >> >> > the cold pool, that's probably where the bug is.  Odd.  It could
>> >> >> >> > also be a bug specific to 'forward' mode, either in the client or
>> >> >> >> > on the osd.  Why did you have it in that mode?
>> >> >> >> > -Sam
>> >> >> >> >
>> >> >> >> > On Thu, Aug 20, 2015 at 3:58 PM, Voloshanenko Igor
>> >> >> >> > <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> >> >> >> We used the 4.x branch, as we have "very good" Samsung 850 Pro
>> >> >> >> >> in production, and they don't support ncq_trim...
>> >> >> >> >>
>> >> >> >> >> And 4.x is the first branch which includes exceptions for this
>> >> >> >> >> in libsata.c.
>> >> >> >> >>
>> >> >> >> >> Sure, we can backport this 1 line to the 3.x branch, but we
>> >> >> >> >> prefer not to go deeper if a package for the new kernel exists.
>> >> >> >> >>
>> >> >> >> >> 2015-08-21 1:56 GMT+03:00 Voloshanenko Igor
>> >> >> >> >> <igor.voloshanenko@xxxxxxxxx>:
>> >> >> >> >>>
>> >> >> >> >>> root@test:~# uname -a
>> >> >> >> >>> Linux ix-s5 4.0.4-040004-generic #201505171336 SMP Sun May 17 17:37:22 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>> >> >> >> >>>
>> >> >> >> >>> 2015-08-21 1:54 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >> >> >> >>>>
>> >> >> >> >>>> Also, can you include the kernel version?
>> >> >> >> >>>> -Sam
>> >> >> >> >>>>
>> >> >> >> >>>> On Thu, Aug 20, 2015 at 3:51 PM, Samuel Just
>> >> >> >> >>>> <sjust@xxxxxxxxxx> wrote:
>> >> >> >> >>>> > Snapshotting with cache/tiering *is* supposed to work.  Can
>> >> >> >> >>>> > you open a bug?
>> >> >> >> >>>> > -Sam
>> >> >> >> >>>> >
>> >> >> >> >>>> > On Thu, Aug 20, 2015 at 3:36 PM, Andrija Panic
>> >> >> >> >>>> > <andrija.panic@xxxxxxxxx> wrote:
>> >> >> >> >>>> >> This was related to the caching layer, which doesn't
>> >> >> >> >>>> >> support snapshotting per the docs... for the sake of
>> >> >> >> >>>> >> closing the thread.
>> >> >> >> >>>> >>
>> >> >> >> >>>> >> On 17 August 2015 at 21:15, Voloshanenko Igor
>> >> >> >> >>>> >> <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> Hi all, can you please help me with an unexplained
>> >> >> >> >>>> >>> situation...
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> All snapshots inside ceph are broken...
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> So, as an example, we have a VM template, as an rbd inside ceph.
>> >> >> >> >>>> >>> We can map it and mount it to check that all is OK with it:
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> root@test:~# rbd map cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5
>> >> >> >> >>>> >>> /dev/rbd0
>> >> >> >> >>>> >>> root@test:~# parted /dev/rbd0 print
>> >> >> >> >>>> >>> Model: Unknown (unknown)
>> >> >> >> >>>> >>> Disk /dev/rbd0: 10.7GB
>> >> >> >> >>>> >>> Sector size (logical/physical): 512B/512B
>> >> >> >> >>>> >>> Partition Table: msdos
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> Number  Start   End     Size    Type     File system  Flags
>> >> >> >> >>>> >>>  1      1049kB  525MB   524MB   primary  ext4         boot
>> >> >> >> >>>> >>>  2      525MB   10.7GB  10.2GB  primary               lvm
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> Then I want to create a snap, so I do:
>> >> >> >> >>>> >>> root@test:~# rbd snap create cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> And now I want to map it:
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> root@test:~# rbd map cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap
>> >> >> >> >>>> >>> /dev/rbd1
>> >> >> >> >>>> >>> root@test:~# parted /dev/rbd1 print
>> >> >> >> >>>> >>> Warning: Unable to open /dev/rbd1 read-write (Read-only file system).
>> >> >> >> >>>> >>> /dev/rbd1 has been opened read-only.
>> >> >> >> >>>> >>> Warning: Unable to open /dev/rbd1 read-write (Read-only file system).
>> >> >> >> >>>> >>> /dev/rbd1 has been opened read-only.
>> >> >> >> >>>> >>> Error: /dev/rbd1: unrecognised disk label
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> Even the md5 is different...
>> >> >> >> >>>> >>> root@ix-s2:~# md5sum /dev/rbd0
>> >> >> >> >>>> >>> 9a47797a07fee3a3d71316e22891d752  /dev/rbd0
>> >> >> >> >>>> >>> root@ix-s2:~# md5sum /dev/rbd1
>> >> >> >> >>>> >>> e450f50b9ffa0073fae940ee858a43ce  /dev/rbd1
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> OK, now I protect the snap and create a clone... but same
>> >> >> >> >>>> >>> thing... the md5 for the clone is the same as for the snap:
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> root@test:~# rbd unmap /dev/rbd1
>> >> >> >> >>>> >>> root@test:~# rbd snap protect cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap
>> >> >> >> >>>> >>> root@test:~# rbd clone cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@new_snap cold-storage/test-image
>> >> >> >> >>>> >>> root@test:~# rbd map cold-storage/test-image
>> >> >> >> >>>> >>> /dev/rbd1
>> >> >> >> >>>> >>> root@test:~# md5sum /dev/rbd1
>> >> >> >> >>>> >>> e450f50b9ffa0073fae940ee858a43ce  /dev/rbd1
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> .... but it's broken...
>> >> >> >> >>>> >>> root@test:~# parted /dev/rbd1 print
>> >> >> >> >>>> >>> Error: /dev/rbd1: unrecognised disk label
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> =========
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> tech details:
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> root@test:~# ceph -v
>> >> >> >> >>>> >>> ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> We have 2 inconsistent pgs, but none of these images are
>> >> >> >> >>>> >>> placed on those pgs...
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> root@test:~# ceph health detail
>> >> >> >> >>>> >>> HEALTH_ERR 2 pgs inconsistent; 18 scrub errors
>> >> >> >> >>>> >>> pg 2.490 is active+clean+inconsistent, acting [56,15,29]
>> >> >> >> >>>> >>> pg 2.c4 is active+clean+inconsistent, acting [56,10,42]
>> >> >> >> >>>> >>> 18 scrub errors
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> ============
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> root@test:~# ceph osd map cold-storage 0e23c701-401d-4465-b9b4-c02939d57bb5
>> >> >> >> >>>> >>> osdmap e16770 pool 'cold-storage' (2) object '0e23c701-401d-4465-b9b4-c02939d57bb5' -> pg 2.74458f70 (2.770) -> up ([37,15,14], p37) acting ([37,15,14], p37)
>> >> >> >> >>>> >>> root@test:~# ceph osd map cold-storage 0e23c701-401d-4465-b9b4-c02939d57bb5@snap
>> >> >> >> >>>> >>> osdmap e16770 pool 'cold-storage' (2) object '0e23c701-401d-4465-b9b4-c02939d57bb5@snap' -> pg 2.793cd4a3 (2.4a3) -> up ([12,23,17], p12) acting ([12,23,17], p12)
>> >> >> >> >>>> >>> root@test:~# ceph osd map cold-storage 0e23c701-401d-4465-b9b4-c02939d57bb5@test-image
>> >> >> >> >>>> >>> osdmap e16770 pool 'cold-storage' (2) object '0e23c701-401d-4465-b9b4-c02939d57bb5@test-image' -> pg 2.9519c2a9 (2.2a9) -> up ([12,44,23], p12) acting ([12,44,23], p12)
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> Also, we use a cache layer, which at the moment is in forward
>> >> >> >> >>>> >>> mode...
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> Can you please help me with this? My brain has stopped
>> >> >> >> >>>> >>> understanding what is going on...
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> Thanks in advance!
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>> _______________________________________________
>> >> >> >> >>>> >>> ceph-users mailing list
>> >> >> >> >>>> >>> ceph-users@xxxxxxxxxxxxxx
>> >> >> >> >>>> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >> >> >>>> >>>
>> >> >> >> >>>> >>
>> >> >> >> >>>> >> --
>> >> >> >> >>>> >>
>> >> >> >> >>>> >> Andrija Panić
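
For anyone repeating the check from the original report: a snapshot of an image that is not being written to should read back byte-for-byte identical to the image, so the two md5 sums are expected to match. Below is a minimal sketch of that comparison in Python. It assumes both devices are already mapped with `rbd map` and readable by the current user. The function names `md5_of` and `same_content` are made up for illustration and are not part of any Ceph tooling.

```python
import hashlib


def md5_of(path, chunk_size=4 * 1024 * 1024):
    """Return the hex MD5 of a file or block device, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def same_content(image_path, snap_path):
    """True if both paths read back with the same MD5 digest."""
    return md5_of(image_path) == md5_of(snap_path)
```

With the device names from the report, `same_content("/dev/rbd0", "/dev/rbd1")` would be expected to return True on a healthy cluster. A False result only says the contents differ. It does not say which layer (client, cache tier, or OSD) is at fault.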