Re: ceph cache tier pool objects not evicted automatically even when reaching full ratio

Thanks a lot to Be-El from #ceph (irc://irc.oftc.net/ceph).
The problem was resolved after setting 'target_max_bytes' for the cache pool:

$ ceph osd pool set cache target_max_bytes 184000000000

Setting only 'cache_target_full_ratio' to 0.7 is not sufficient for the cache tiering agent: it must also know how many bytes the specified pool is allowed to hold before it can start the eviction process. With both options in place, the cache tiering agent starts evicting objects from the cache pool once 70% of target_max_bytes has been written to that pool (on reaching about 129 GB in my case).
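For the record, a minimal sketch of the two settings working together and the resulting eviction threshold (same pool name and byte value as above; the 'get' calls just verify what is configured):

$ ceph osd pool set cache target_max_bytes 184000000000
$ ceph osd pool set cache cache_target_full_ratio 0.7
$ ceph osd pool get cache target_max_bytes
$ ceph osd pool get cache cache_target_full_ratio
# eviction starts at cache_target_full_ratio * target_max_bytes:
# 0.7 * 184000000000 = 128800000000 bytes, i.e. roughly 129 GB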

Here is how it went on IRC:

[14:45] <Be-El> stannum: there's no limit configured for the cache pool. the system cannot know when to evict data
[14:46] <stannum> Be-El: what kind of limit?
[14:46] <Be-El> stannum: a limit for the size or the number of objects in the cache pool
[14:48] <stannum> Be-El: setting cache_target_full_ratio: 0.7 is not sufficient?
[14:48] <Be-El> stannum: that's a ratio. you need a reference for it
[14:51] <Be-El> stannum: try setting 'target_max_bytes' to the maximum size of the underlying storage minus some percent overhead and keep the replication factor in mind
[14:51] <stannum> Be-El: oh, that is not clear from the documentation. it says that: The cache tiering agent can flush or evict objects relative to the size of the cache pool. And later it says: The cache tiering agent can flush or evict objects based upon the total number of bytes or the total number of objects.
[14:52] <stannum> Be-El: and nothing is said about setting both options
[14:52] <Be-El> stannum: yes, the documentation is somewhat lacking. ceph cannot determine the amount of available space (and thus the maximum possible size of a pool)
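
A rough sizing sketch following Be-El's advice; the raw capacity, replication size and headroom below are purely illustrative and not taken from this cluster:

# e.g. 600 GB of raw capacity on the cache-tier OSDs, replicated pool with size=3,
# keeping ~10% free as overhead:
#   600000000000 / 3 * 0.9 = 180000000000
$ ceph osd pool set cache target_max_bytes 180000000000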


On 10.03.2015 14:41, Kamil Kuramshin wrote:
    Hi, folks! I'm testing a cache tier in front of an erasure coded pool, with an RBD image on it, and now I'm facing a problem: the cache pool gets full and objects are not evicted automatically; they only go when I manually run rados -p cache cache-flush-evict-all

    client side is:
    superuser@share:~$ uname -a
    Linux share 3.16-2-amd64 #1 SMP Debian 3.16.3-2 (2014-09-20) x86_64 GNU/Linux


    All ceph nodes are Debian wheezy:
    superuser~$ dpkg -l | grep ceph
    ii  ceph                           0.87-1~bpo70+1                   amd64        distributed storage and file system
    ii  ceph-common                    0.87-1~bpo70+1                   amd64        common utilities to mount and interact with a ceph storage cluster
    ii  ceph-fs-common                 0.87-1~bpo70+1                   amd64        common utilities to mount and interact with a ceph file system
    ii  ceph-mds                       0.87-1~bpo70+1                   amd64        metadata server for the ceph distributed file system
    ii  libcephfs1                     0.87-1~bpo70+1                   amd64        Ceph distributed file system client library
    ii  libcurl3-gnutls:amd64          7.29.0-1~bpo70+1.ceph            amd64        easy-to-use client-side URL transfer library (GnuTLS flavour)
    ii  python-ceph                    0.87-1~bpo70+1                   amd64        Python libraries for the Ceph distributed filesystem



    Here are all the steps to reproduce (except for the pool creation):



    superuser@admin:~$ ceph osd pool get ec_backup-storage erasure_code_profile
    erasure_code_profile: default
    superuser@admin:~$ ceph osd erasure-code-profile get default
    directory=/usr/lib/ceph/erasure-code
    k=2
    m=1
    plugin=jerasure
    technique=reed_sol_van



    *********** ADMIN NODE OPERATIONS ************
     
    superuser@admin:~$ ceph df
    GLOBAL:
        SIZE     AVAIL     RAW USED     %RAW USED
        242T      224T        6092G          2.46
    POOLS:
        NAME                  ID     USED      %USED     MAX AVAIL     OBJECTS
        ec_backup-storage     4          0         0          147T           0
        cache                 5          0         0          185G           0
        block-devices         6      1948G      0.79        75638G      498771
    superuser@admin:~$ rados df
    pool name       category                 KB      objects       clones     degraded      unfound           rd        rd KB           wr        wr KB
    block-devices   -                 2042805201       498771            0            0           0        67127    259320535      2070571   2403248346
    cache           -                          0            0            0            0           0        60496    247235411       966553    499544074
    ec_backup-storage -                          0            0            0            0           0       156988    537227276       400355    819838985
      total used      6388431372       498771
      total avail   240559782780
      total space   260163775608
     
    ***** 'cache' is a replicated pool, 'ec_backup-storage' is an erasure coded pool *****

    ***** Running my simple script for enabling cache tiering:
     
    superuser@admin:~$ ./enable_cache_tier.sh cache ec_backup-storage
    pool 'cache' is now (or already was) a tier of 'ec_backup-storage'
    set cache-mode for pool 'cache' to writeback
    overlay for 'ec_backup-storage' is now (or already was) 'cache'
    set pool 5 hit_set_type to bloom
    set pool 5 cache_target_dirty_ratio to 0.4
    set pool 5 cache_target_full_ratio to 0.7
    set pool 5 cache_min_flush_age to 10
    set pool 5 cache_min_evict_age to 10
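
    ***** (A sketch of what the script presumably runs, reconstructed from its output above; the argument handling is assumed:)

    #!/bin/sh
    # usage: ./enable_cache_tier.sh <cache-pool> <base-pool>
    CACHE="$1"
    BASE="$2"
    ceph osd tier add "$BASE" "$CACHE"             # make the cache pool a tier of the base pool
    ceph osd tier cache-mode "$CACHE" writeback    # cache in writeback mode
    ceph osd tier set-overlay "$BASE" "$CACHE"     # direct client traffic through the cache pool
    ceph osd pool set "$CACHE" hit_set_type bloom
    ceph osd pool set "$CACHE" cache_target_dirty_ratio 0.4
    ceph osd pool set "$CACHE" cache_target_full_ratio 0.7
    ceph osd pool set "$CACHE" cache_min_flush_age 10
    ceph osd pool set "$CACHE" cache_min_evict_age 10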

    ***** Displaying some cache pool parameters:

    superuser@admin:~$ for param in cache_target_dirty_ratio cache_target_full_ratio target_max_bytes target_max_objects cache_min_flush_age cache_min_evict_age; do  ceph osd pool get cache $param; done
    cache_target_dirty_ratio: 0.4
    cache_target_full_ratio: 0.7
    target_max_bytes: 0
    target_max_objects: 0
    cache_min_flush_age: 10
    cache_min_evict_age: 10


    *********** END ADMIN NODE OPERATIONS ************
     
    *********** CEPH CLIENT OPERATIONS ************
     
    superuser@share:~$ rbd create -p  ec_backup-storage ec_image.img --size 500000 --image-format 2
    superuser@share:~$ rbd -p ec_backup-storage ls
    ec_image.img
    superuser@share:~$ sudo rbd map -p ec_backup-storage ec_image.img
    /dev/rbd0
    superuser@share:~$ rbd showmapped
    id pool              image        snap device    
    0  ec_backup-storage ec_image.img -    /dev/rbd0
    superuser@share:~$ sudo parted /dev/rbd0 p
    Error: /dev/rbd0: unrecognised disk label
    Model: Unknown (unknown)                                                  
    Disk /dev/rbd0: 524GB
    Sector size (logical/physical): 512B/512B
    Partition Table: unknown
    Disk Flags:
     
     
    **** Now start filling the rbd device with 400 GB of zeros ****
    superuser@share:~$ sudo dd if=/dev/zero of=/dev/rbd0 bs=4M count=100000 oflag=direct
    ^C39901+0 records in ; I pressed CTRL+C here because I was already getting warnings in ceph -s (see below)
    39901+0 records out
    167356923904 bytes (167 GB) copied, 3387.59 s, 49.4 MB/s


    *********** END CEPH CLIENT OPERATIONS ************


    *********** ADMIN NODE OPERATIONS ************


    **** Now check ceph df: 145G of the 172G 'cache' pool, about 90%, is already occupied!
    superuser@admin:~$ ceph df
    GLOBAL:
        SIZE     AVAIL     RAW USED     %RAW USED
        242T      223T        6555G          2.64
    POOLS:
        NAME                  ID     USED      %USED     MAX AVAIL     OBJECTS
        ec_backup-storage     4          0         0          147T           0
        cache                 5       155G      0.06        12333M       39906
        block-devices         6      1948G      0.79        75642G      498771
     
       
    ***** ec_backup-storage, the backing storage pool for cold data, is still empty????
    ***** but at the same time:
    superuser@admin:~$ ceph health detail
    HEALTH_WARN 3 near full osd(s)
    osd.45 is near full at 88%
    osd.95 is near full at 87%
    osd.100 is near full at 86%


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
