On Wed, 28 Jan 2015, Irek Fasikhov wrote:
> Sage,
> Is there a setting so that deleted objects bypass the cache tier pool?

There's currently no knob or hint to do that. It would be pretty simple
to add, but it's a heuristic that only works for certain workloads.
(A rough command sketch of the flush workaround from the earlier reply
quoted below is appended after the quoted thread.)

sage

> Thanks
>
> Wed Jan 28 2015 at 5:13:36 PM, Irek Fasikhov <malmyzh@xxxxxxxxx>:
>
> Hi, Sage.
>
> Yes, Firefly.
> [root@ceph05 ~]# ceph --version
> ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
>
> Yes, I have seen this behavior.
>
> [root@ceph08 ceph]# rbd info vm-160-disk-1
> rbd image 'vm-160-disk-1':
>         size 32768 MB in 8192 objects
>         order 22 (4096 kB objects)
>         block_name_prefix: rbd_data.179faf52eb141f2
>         format: 2
>         features: layering
>         parent: rbd/base-145-disk-1@__base__
>         overlap: 32768 MB
> [root@ceph08 ceph]# rbd rm vm-160-disk-1
> Removing image: 100% complete...done.
> [root@ceph08 ceph]# rbd info vm-160-disk-1
> 2015-01-28 10:39:01.595785 7f1fbea9e760 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
> rbd: error opening image vm-160-disk-1: (2) No such file or directory
>
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>    5944    5944  249633
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>    5857    5857  245979
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
>    4377    4377  183819
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>    5017    5017  210699
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>    5015    5015  210615
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>    1986    1986   83412
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
>     981     981   41202
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
>     802     802   33684
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>    1611    1611   67662
>
> Thanks, Sage!
>
>
> Tue Jan 27 2015 at 7:01:43 PM, Sage Weil <sage@xxxxxxxxxxxx>:
>
> On Tue, 27 Jan 2015, Irek Fasikhov wrote:
> > Hi, All.
> > Indeed, there is a problem. I removed 1 TB of data and the space on
> > the cluster is not freed. Is this expected behavior or a bug? And how
> > long will it take to be cleaned up?
>
> Your subject says cache tier but I don't see it in the 'ceph df' output
> below. The cache tier will store 'whiteout' objects that cache object
> non-existence, and these could be delaying some deletions. You can
> wrangle the cluster into flushing those with
>
>     ceph osd pool set <cachepool> cache_target_dirty_ratio .05
>
> (though you'll probably want to change it back to the default .4 later).
>
> If there's no cache tier involved, there may be another problem. What
> version is this? Firefly?
>
> sage
>
> > Sat Sep 20 2014 at 8:19:24 AM, Mikaël Cluseau <mcluseau@xxxxxx>:
> >
> > Hi all,
> >
> > I have weird behaviour on my firefly "test + convenience storage"
> > cluster. It consists of 2 nodes with a light imbalance in available
> > space:
> >
> > # id    weight  type name       up/down reweight
> > -1      14.58   root default
> > -2      8.19            host store-1
> > 1       2.73                    osd.1   up      1
> > 0       2.73                    osd.0   up      1
> > 5       2.73                    osd.5   up      1
> > -3      6.39            host store-2
> > 2       2.73                    osd.2   up      1
> > 3       2.73                    osd.3   up      1
> > 4       0.93                    osd.4   up      1
> >
> > I used to store ~8TB of rbd volumes, coming to a near-full state.
> > There were some annoying "stuck misplaced" PGs, so I began to remove
> > 4.5TB of data; the weird thing is that the space hasn't been reclaimed
> > on the OSDs, they stayed stuck around 84% usage. I tried to move PGs
> > around, and it turns out the space is correctly "reclaimed" if I take
> > an OSD out, let it empty its XFS volume, and then take it in again.
> >
> > I'm currently applying this to each OSD in turn, but I thought it
> > could be worth telling you about this. The current ceph df output is:
> >
> > GLOBAL:
> >     SIZE       AVAIL     RAW USED     %RAW USED
> >     12103G     5311G     6792G        56.12
> > POOLS:
> >     NAME            ID     USED       %USED     OBJECTS
> >     data            0      0          0         0
> >     metadata        1      0          0         0
> >     rbd             2      444G       3.67      117333
> >     [...]
> >     archives-ec     14     3628G      29.98     928902
> >     archives        15     37518M     0.30      273167
> >
> > Before "just moving data", AVAIL was around 3TB.
> >
> > I finished the process with the OSDs on store-1, which show the
> > following space usage now:
> >
> > /dev/sdb1       2.8T  1.4T  1.4T  50%  /var/lib/ceph/osd/ceph-0
> > /dev/sdc1       2.8T  1.3T  1.5T  46%  /var/lib/ceph/osd/ceph-1
> > /dev/sdd1       2.8T  1.3T  1.5T  48%  /var/lib/ceph/osd/ceph-5
> >
> > I'm currently fixing OSD 2; OSD 3 will be the last one to be fixed.
> > The df on store-2 shows the following:
> >
> > /dev/sdb1       2.8T  1.9T  855G  70%  /var/lib/ceph/osd/ceph-2
> > /dev/sdc1       2.8T  2.4T  417G  86%  /var/lib/ceph/osd/ceph-3
> > /dev/sdd1       932G  481G  451G  52%  /var/lib/ceph/osd/ceph-4
> >
> > OSD 2 was at 84% 3h ago, and OSD 3 was ~75%.
> >
> > During the rbd rm (which took a bit more than 3 days), the ceph log
> > was showing things like this:
> >
> > 2014-09-03 16:17:38.831640 mon.0 192.168.1.71:6789/0 417194 : [INF] pgmap v14953987: 3196 pgs: 2882 active+clean, 314 active+remapped; 7647 GB data, 11067 GB used, 3828 GB / 14896 GB avail; 0 B/s rd, 6778 kB/s wr, 18 op/s; -5/5757286 objects degraded (-0.000%)
> > [...]
> > 2014-09-05 03:09:59.895507 mon.0 192.168.1.71:6789/0 513976 : [INF] pgmap v15050766: 3196 pgs: 2882 active+clean, 314 active+remapped; 6010 GB data, 11156 GB used, 3740 GB / 14896 GB avail; 0 B/s rd, 0 B/s wr, 8 op/s; -388631/5247320 objects degraded (-7.406%)
> > [...]
> > 2014-09-06 03:56:50.008109 mon.0 192.168.1.71:6789/0 580816 : [INF] pgmap v15117604: 3196 pgs: 2882 active+clean, 314 active+remapped; 4865 GB data, 11207 GB used, 3689 GB / 14896 GB avail; 0 B/s rd, 6117 kB/s wr, 22 op/s; -706519/3699415 objects degraded (-19.098%)
> > 2014-09-06 03:56:44.476903 osd.0 192.168.1.71:6805/11793 729 : [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.058434 secs
> > 2014-09-06 03:56:44.476909 osd.0 192.168.1.71:6805/11793 730 : [WRN] slow request 30.058434 seconds old, received at 2014-09-06 03:56:14.418429: osd_op(client.19843278.0:46081 rb.0.c7fd7f.238e1f29.00000000b3fa [delete] 15.b8fb7551 ack+ondisk+write e38950) v4 currently waiting for blocked object
> > 2014-09-06 03:56:49.477785 osd.0 192.168.1.71:6805/11793 731 : [WRN] 2 slow requests, 1 included below; oldest blocked for > 35.059315 secs
> > [... stabilizes here:]
> > 2014-09-06 22:13:48.771531 mon.0 192.168.1.71:6789/0 632527 : [INF] pgmap v15169313: 3196 pgs: 2882 active+clean, 314 active+remapped; 4139 GB data, 11215 GB used, 3681 GB / 14896 GB avail; 64 B/s rd, 64 B/s wr, 0 op/s; -883219/3420796 objects degraded (-25.819%)
> > [...]
> > 2014-09-07 03:09:48.491325 mon.0 192.168.1.71:6789/0 633880 : [INF] pgmap v15170666: 3196 pgs: 2882 active+clean, 314 active+remapped; 4139 GB data, 11215 GB used, 3681 GB / 14896 GB avail; 18727 B/s wr, 2 op/s; -883219/3420796 objects degraded (-25.819%)
> >
> > And now, during the data movement I described before:
> >
> > 2014-09-20 15:16:13.394694 mon.0 [INF] pgmap v15344707: 3196 pgs: 2132 active+clean, 432 active+remapped+wait_backfill, 621 active+remapped, 11 active+remapped+backfilling; 4139 GB data, 6831 GB used, 5271 GB / 12103 GB avail; 379097/3792969 objects degraded (9.995%)
> >
> > If some ceph developer wants me to do something or to provide some
> > data, please say so quickly; I will probably process OSD 3 in ~16-20h.
> > (Of course, I'd prefer not to lose the data btw :-))
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
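For anyone following along, here is a minimal, untested sketch of the flush
workaround from the reply quoted above. It assumes the cache pool is named
"rbdcache" and the base pool "rbd", as in Irek's transcript, and reuses the
block_name_prefix 179faf52eb141f2 from that transcript; substitute your own
names. Lowering cache_target_dirty_ratio makes the tiering agent flush dirty
objects (including the whiteouts left behind by deletes) much sooner, so the
deletes actually reach the base pool:

    # temporarily flush dirty objects / delete whiteouts more aggressively
    ceph osd pool set rbdcache cache_target_dirty_ratio .05

    # watch the leftover objects for the deleted image drain away
    rados -p rbdcache ls | grep 179faf52eb141f2 | wc
    rados -p rbd ls | grep 179faf52eb141f2 | wc

    # once things look clean, restore the default ratio
    ceph osd pool set rbdcache cache_target_dirty_ratio .4

If your release supports it, 'rados -p rbdcache cache-flush-evict-all' can
also force a full flush/evict of the cache pool, but it is heavyweight and
will evict hot objects along with the whiteouts.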
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com