Wrong object and used space count in cache tier pool

Hello ceph-users,

I am currently running tests on a small cluster, and cache tiering is one of them. The cluster runs Ceph 0.87 Giant on three Ubuntu 14.04 servers with the 3.16.0 kernel, for a total of 8 OSDs and 1 MON.

Since there are no SSDs in those servers, I am testing cache tiering with an erasure-coded pool as cold storage and a replicated pool as cache. The cache settings are the defaults you'll find in the documentation, and I'm using writeback mode. To simulate a small cache, the hot pool has a 1024 MB space quota. I then write 4 MB objects to the storage pool using 'rados bench' (with --no-cleanup).
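For completeness, the setup was done roughly like this (a sketch from memory: the pg count and erasure-code profile of the data pool are not shown in the dump below, so treat those values as placeholders):

% ceph osd pool create test1_ec-data 512 512 erasure
% ceph osd pool create test1_ct-cache 512 512 replicated
% ceph osd tier add test1_ec-data test1_ct-cache
% ceph osd tier cache-mode test1_ct-cache writeback
% ceph osd tier set-overlay test1_ec-data test1_ct-cache
% ceph osd pool set-quota test1_ct-cache max_bytes 1073741824   # 1024 MB quota
% ceph osd pool set test1_ct-cache target_max_bytes 1006632960  # 960 MB
% rados bench -p test1_ec-data 60 write -b 4194304 --no-cleanup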

Here are my cache pool settings according to InkScope:
pool                            15
pool name                       test1_ct-cache
auid                            0
type                            1 (replicated)
size                            2
min size                        1
crush ruleset                   0 (replicated_ruleset)
pg num                          512
pg placement_num                512
quota max_bytes                 1 GB
quota max_objects               0
flags names                     hashpspool,incomplete_clones
tiers                           none
tier of                         14 (test1_ec-data)
read tier                       -1
write tier                      -1
cache mode                      writeback
cache target_dirty_ratio_micro  40 %
cache target_full_ratio_micro   80 %
cache min_flush_age             0 s
cache min_evict_age             0 s
target max_objects              0
target max_bytes                960 MB
hit set_count                   1
hit set_period                  3600 s
hit set_params  target_size :                0
                seed :                       0
                type :                       bloom
                false_positive_probability : 0.050000
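
For what it's worth, the same values can be read back from the CLI instead of InkScope (commands quoted from memory; the key names should match the fields above):

% ceph osd dump | grep test1_ct-cache
% ceph osd pool get test1_ct-cache target_max_bytes
% ceph osd pool get test1_ct-cache hit_set_period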

I believe the tiering itself works: I do see objects and bytes being transferred from the cache to the storage when I write data. I checked with 'rados ls', and the object count in the cold storage pool is always spot on. But it isn't in the cache: when I run 'ceph df' or 'rados df', the space and object counts do not match 'rados ls', and are usually much larger:

% ceph df
…
POOLS:
    NAME               ID     USED      %USED     MAX AVAIL     OBJECTS
    …
    test1_ec-data      14     5576M      0.04        11115G        1394
    test1_ct-cache     15      772M         0         7410G         250
% rados -p test1_ec-data ls | wc -l
1394
% rados -p test1_ct-cache ls | wc -l
56
# And this corresponds to 220M of data in test1_ct-cache

Not only does this prevent me from knowing exactly what the cache is doing, it is also the value used to enforce the quota: I've seen write operations fail because the space count had reached 1G, although I was quite sure there was enough free space. The count does not correct itself over time, even after waiting overnight. It only changes when I "poke" the pool by changing a setting or writing data, and it remains wrong (and not off by the same number of objects each time). The changes in the object counts given by 'rados ls' in both pools do match the number of objects written by 'rados bench'.
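
One test I can still try is forcing a full flush/evict of the cache and re-checking the counts, something like this (a sketch; I am assuming 'cache-flush-evict-all' and the DIRTY column of 'ceph df detail' behave this way on Giant):

% rados -p test1_ct-cache cache-flush-evict-all   # flush dirty objects, then evict clean ones
% ceph df detail                                  # DIRTY should drop to 0 for the cache pool
% rados -p test1_ct-cache ls | wc -l              # compare again with the OBJECTS count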

Does anybody know where this mismatch might come from? Is there a way to see more details about what's going on? Or is this the normal behavior of a cache pool when 'rados bench' is used?

Thank you in advance for any help.
Regards,
--
Xavier
