Hi,

Thanks Gregory and Robert, now it is a bit clearer. After cache-flush-evict-all almost all objects were deleted, but 101 remained in the cache pool. Also, one PG changed its state to inconsistent and the cluster went to HEALTH_ERR. "ceph pg repair" changed the object count to 100, but at least the cluster became healthy again. Now it looks like:

POOLS:
    NAME          ID     USED      %USED     MAX AVAIL     OBJECTS
    rbd-cache     36     23185         0          157G         100
    rbd           37         0         0          279G           0

# rados -p rbd-cache ls -all
# rados -p rbd ls -all
#

Is there any way to find out what the objects are? "ceph pg ls-by-pool rbd-cache" gives me the PGs of the objects, but looking into these PGs gives me nothing I can understand :)

# ceph pg ls-by-pool rbd-cache | head -4
pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
36.0 1 0 0 0 0 83 926 926 active+clean 2015-11-03 22:06:39.193371 798'926 798:640 [4,0,3] 4 [4,0,3] 4 798'926 2015-11-03 22:06:39.193321 798'926 2015-11-03 22:06:39.193321
36.1 1 0 0 0 0 193 854 854 active+clean 2015-11-03 18:28:51.190819 798'854 798:515 [1,4,3] 1 [1,4,3] 1 796'628 2015-11-03 18:28:51.190749 0'0 2015-11-02 18:28:42.546224
36.2 1 0 0 0 0 198 869 869 active+clean 2015-11-03 18:28:44.556048 798'869 798:554 [2,0,1] 2 [2,0,1] 2 796'650 2015-11-03 18:28:44.555980 0'0 2015-11-02 18:28:42.546226
#

# find /var/lib/ceph/osd/ceph-0/current/36.0_head/
/var/lib/ceph/osd/ceph-0/current/36.0_head/
/var/lib/ceph/osd/ceph-0/current/36.0_head/__head_00000000__24
/var/lib/ceph/osd/ceph-0/current/36.0_head/hit\uset\u36.0\uarchive\u2015-11-03 11:12:37.962360\u2015-11-03 21:28:58.149662__head_00000000_.ceph-internal_24
# find /var/lib/ceph/osd/ceph-0/current/36.2_head/
/var/lib/ceph/osd/ceph-0/current/36.2_head/
/var/lib/ceph/osd/ceph-0/current/36.2_head/__head_00000002__24
/var/lib/ceph/osd/ceph-0/current/36.2_head/hit\uset\u36.2\uarchive\u2015-11-02 19:50:00.788736\u2015-11-03 21:29:02.460568__head_00000002_.ceph-internal_24
#

# ls -l /var/lib/ceph/osd/ceph-0/current/36.0_head/hit\\uset\\u36.0\\uarchive\\u2015-11-03\ 11\:12\:37.962360\\u2015-11-03\ 21\:28\:58.149662__head_00000000_.ceph-internal_24
-rw-r--r--. 1 root root 83 Nov  3 21:28 /var/lib/ceph/osd/ceph-0/current/36.0_head/hit\uset\u36.0\uarchive\u2015-11-03 11:12:37.962360\u2015-11-03 21:28:58.149662__head_00000000_.ceph-internal_24
#
# ls -l /var/lib/ceph/osd/ceph-0/current/36.2_head/hit\\uset\\u36.2\\uarchive\\u2015-11-02\ 19\:50\:00.788736\\u2015-11-03\ 21\:29\:02.460568__head_00000002_.ceph-internal_24
-rw-r--r--. 1 root root 198 Nov  3 21:29 /var/lib/ceph/osd/ceph-0/current/36.2_head/hit\uset\u36.2\uarchive\u2015-11-02 19:50:00.788736\u2015-11-03 21:29:02.460568__head_00000002_.ceph-internal_24
#

--
Dmitry Glushenok
Jet Infosystems
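
The 100 objects left in rbd-cache look like the per-PG hit_set archives: the cache pool has pg_num 100, each PG directory above holds exactly one hit\uset\u...\uarchive\u... file, and the file sizes (83 and 198 bytes) match the per-PG byte counts reported by "ceph pg ls-by-pool". They are stored in the internal ".ceph-internal" namespace (visible in the on-disk file names), which is why a plain "rados ls" shows nothing. A minimal sketch of how one might list them from the CLI, assuming the rados tool in this release supports namespace-aware listing (the --all and -N/--namespace options); not verified on 0.94.5:

# rados -p rbd-cache ls --all
# rados -p rbd-cache -N ".ceph-internal" ls

With --all the listing should include the namespace next to each object name, so the hit_set archives can be seen without digging into the OSD filestore.
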
> On 3 Nov 2015, at 20:11, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>
> Try:
>
> rados -p {cachepool} cache-flush-evict-all
>
> and see if the objects clean up.
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Tue, Nov 3, 2015 at 8:02 AM, Gregory Farnum wrote:
>> When you have a caching pool in writeback mode, updates to objects
>> (including deletes) are handled by writeback rather than writethrough.
>> Since there's no other activity against these pools, there is nothing
>> prompting the cache pool to flush updates out to the backing pool, so
>> the backing pool hasn't deleted its objects because nothing's told it
>> to. You'll find that the cache pool has deleted the data for its
>> objects, but it's keeping around a small "whiteout" and the object
>> info metadata.
>> The "rados ls" you're using has never played nicely with cache tiering
>> and probably never will. :( Listings are expensive operations and
>> modifying them to do more than the simple info scan would be fairly
>> expensive in terms of computation and IO.
>>
>> I think there are some caching commands you can send to flush updates
>> which would cause the objects to be entirely deleted, but I don't have
>> them off-hand. You can probably search the mailing list archives or
>> the docs for tiering commands. :)
>> -Greg
>>
>> On Tue, Nov 3, 2015 at 12:40 AM, Dmitry Glushenok wrote:
>>> Hi,
>>>
>>> While benchmarking a tiered pool using rados bench, it was noticed that objects are not being removed after the test.
>>>
>>> The test was performed using "rados -p rbd bench 3600 write". The pool is not used by anything else.
>>>
>>> Just before the end of the test:
>>> POOLS:
>>>     NAME          ID     USED       %USED     MAX AVAIL     OBJECTS
>>>     rbd-cache     36     33110M      3.41          114G        8366
>>>     rbd           37     43472M      4.47          237G       10858
>>>
>>> Some time later (a few hundred writes were flushed, rados automatic cleanup finished):
>>> POOLS:
>>>     NAME          ID     USED       %USED     MAX AVAIL     OBJECTS
>>>     rbd-cache     36     22998          0          157G       16342
>>>     rbd           37     46050M      4.74          234G       11503
>>>
>>> # rados -p rbd-cache ls | wc -l
>>> 16242
>>> # rados -p rbd ls | wc -l
>>> 11503
>>> #
>>>
>>> # rados -p rbd cleanup
>>> error during cleanup: -2
>>> error 2: (2) No such file or directory
>>> #
>>>
>>> # rados -p rbd cleanup --run-name "" --prefix prefix ""
>>> Warning: using slow linear search
>>> Removed 0 objects
>>> #
>>>
>>> # rados -p rbd ls | head -5
>>> benchmark_data_dropbox01.tzk_7641_object10901
>>> benchmark_data_dropbox01.tzk_7641_object9645
>>> benchmark_data_dropbox01.tzk_7641_object10389
>>> benchmark_data_dropbox01.tzk_7641_object10090
>>> benchmark_data_dropbox01.tzk_7641_object11204
>>> #
>>>
>>> # rados -p rbd-cache ls | head -5
>>> benchmark_data_dropbox01.tzk_7641_object10901
>>> benchmark_data_dropbox01.tzk_7641_object9645
>>> benchmark_data_dropbox01.tzk_7641_object10389
>>> benchmark_data_dropbox01.tzk_7641_object5391
>>> benchmark_data_dropbox01.tzk_7641_object10090
>>> #
>>>
>>> So it looks like the objects are still in place (in both pools?). But it is not possible to remove them:
>>>
>>> # rados -p rbd rm benchmark_data_dropbox01.tzk_7641_object10901
>>> error removing rbd>benchmark_data_dropbox01.tzk_7641_object10901: (2) No such file or directory
>>> #
>>>
>>> # ceph health
>>> HEALTH_OK
>>> #
>>>
>>> Can somebody explain the behavior? And is it possible to clean up the benchmark data without recreating the pools?
>>>
>>> ceph version 0.94.5
>>>
>>> # ceph osd dump | grep rbd
>>> pool 36 'rbd-cache' replicated size 3 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 100 pgp_num 100 last_change 755 flags hashpspool,incomplete_clones tier_of 37 cache_mode writeback target_bytes 107374182400 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
>>> pool 37 'rbd' erasure size 5 min_size 3 crush_ruleset 2 object_hash rjenkins pg_num 100 pgp_num 100 last_change 745 lfor 745 flags hashpspool tiers 36 read_tier 36 write_tier 36 stripe_width 4128
>>> #
>>>
>>> # ceph osd pool get rbd-cache hit_set_type
>>> hit_set_type: bloom
>>> # ceph osd pool get rbd-cache hit_set_period
>>> hit_set_period: 3600
>>> # ceph osd pool get rbd-cache hit_set_count
>>> hit_set_count: 1
>>> # ceph osd pool get rbd-cache target_max_objects
>>> target_max_objects: 0
>>> # ceph osd pool get rbd-cache target_max_bytes
>>> target_max_bytes: 107374182400
>>> # ceph osd pool get rbd-cache cache_target_dirty_ratio
>>> cache_target_dirty_ratio: 0.1
>>> # ceph osd pool get rbd-cache cache_target_full_ratio
>>> cache_target_full_ratio: 0.2
>>> #
>>>
>>> Crush map:
>>> root cache_tier {
>>>         id -7           # do not change unnecessarily
>>>         # weight 0.450
>>>         alg straw
>>>         hash 0  # rjenkins1
>>>         item osd.0 weight 0.090
>>>         item osd.1 weight 0.090
>>>         item osd.2 weight 0.090
>>>         item osd.3 weight 0.090
>>>         item osd.4 weight 0.090
>>> }
>>> root store_tier {
>>>         id -8           # do not change unnecessarily
>>>         # weight 0.450
>>>         alg straw
>>>         hash 0  # rjenkins1
>>>         item osd.5 weight 0.090
>>>         item osd.6 weight 0.090
>>>         item osd.7 weight 0.090
>>>         item osd.8 weight 0.090
>>>         item osd.9 weight 0.090
>>> }
>>> rule cache {
>>>         ruleset 1
>>>         type replicated
>>>         min_size 0
>>>         max_size 5
>>>         step take cache_tier
>>>         step chooseleaf firstn 0 type osd
>>>         step emit
>>> }
>>> rule store {
>>>         ruleset 2
>>>         type erasure
>>>         min_size 0
>>>         max_size 5
>>>         step take store_tier
>>>         step chooseleaf firstn 0 type osd
>>>         step emit
>>> }
>>>
>>> Thanks
>>>
>>> --
>>> Dmitry Glushenok
>>> Jet Infosystems

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
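
On the original question of cleaning up the benchmark data without recreating the pools: the cache-flush-evict-all run above did propagate the pending deletes to the backing pool, which is why the benchmark objects are now gone from both rbd and rbd-cache; the hit_set archives that remain are internal bookkeeping of the cache tier and disappear at the latest when the cache pool itself is deleted. If the cache tier is no longer needed, the removal sequence described in the cache tiering documentation should detach it cleanly. A rough sketch using the pool names from this thread, not tested on this cluster (newer releases may additionally require --yes-i-really-mean-it when switching to forward mode):

# ceph osd tier cache-mode rbd-cache forward
# rados -p rbd-cache cache-flush-evict-all
# ceph osd tier remove-overlay rbd
# ceph osd tier remove rbd rbd-cache

Forward mode stops new writes from being cached, the flush/evict empties whatever is still dirty in the cache pool, and the last two commands remove the overlay and the tier relationship from the base pool.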