Re: bluestore: osd bluestore_allocated is much larger than bluestore_stored

Please note that simply downsizing min_alloc_size might negatively impact OSD performance. That's why this modification has been postponed until Pacific, where we've made a bunch of additional changes to eliminate the performance drop.
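
For illustration, here is a rough back-of-the-envelope sketch of where the overhead comes from (example numbers only, not taken from the tracker ticket; it assumes 64 KiB client writes and the worst case where every shard of the write lands in freshly allocated blobs). On an EC k=2, m=1 pool each client write is split into k data shards plus m coding shards, and BlueStore rounds every shard allocation up to min_alloc_size:

    # Rough model: one small client write onto an EC k+m pool, with each
    # shard allocation rounded up to min_alloc_size (worst case: the write
    # lands in fresh blobs on every shard). Example numbers only.
    def alloc_overhead(write_size, k, m, min_alloc_size):
        shard = write_size // k                                  # data per shard
        stored = shard * (k + m)                                 # accounted as stored
        rounded = -(-shard // min_alloc_size) * min_alloc_size   # round up per shard
        allocated = rounded * (k + m)                            # accounted as allocated
        return stored, allocated

    for mas in (64 * 1024, 4 * 1024):
        stored, allocated = alloc_overhead(64 * 1024, k=2, m=1, min_alloc_size=mas)
        print(f"min_alloc_size={mas // 1024}K: stored={stored}, "
              f"allocated={allocated}, ratio={allocated / stored:.1f}x")

With a 64 KiB allocation unit such a write consumes roughly twice what it stores, which is in the same ballpark as the ~2x allocated/stored ratio reported below; with a 4 KiB unit the rounding loss mostly disappears. Also keep in mind that min_alloc_size is baked into an OSD at mkfs time, so lowering it only affects newly provisioned OSDs.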


Regards,

Igor
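
(For spot-checking the counters Jerry collects below, a small helper along these lines may be handy. It is only a sketch: it assumes it runs on the OSD node itself with access to the local admin sockets, and the OSD ids listed are just examples.)

    import json
    import subprocess

    # Example OSD ids; adjust to the OSDs hosted on the local node.
    for osd in (0, 1, 2, 3):
        try:
            out = subprocess.check_output(
                ["ceph", "daemon", f"osd.{osd}", "perf", "dump"],
                stderr=subprocess.DEVNULL)
        except subprocess.CalledProcessError:
            print(f"osd.{osd}: admin socket not reachable")
            continue
        bs = json.loads(out)["bluestore"]
        print(f"osd.{osd}: allocated/stored = "
              f"{bs['bluestore_allocated'] / bs['bluestore_stored']:.2f}")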

On 7/8/2020 12:32 PM, Jerry Pu wrote:
Thanks for your reply. It's helpful! We may consider adjusting min_alloc_size to a lower value or taking other actions based on your analysis of the space overhead with EC pools. Thanks.

Best
Jerry Pu

Igor Fedotov <ifedotov@xxxxxxx> wrote on Tue, Jul 7, 2020 at 4:10 PM:

    I think you're facing the issue covered by the following ticket:

    https://tracker.ceph.com/issues/44213


    Unfortunately, the only known solution is migrating to a 4K min_alloc_size,
    which will be available starting with Pacific.


    Thanks,

    Igor

    On 7/7/2020 6:38 AM, Jerry Pu wrote:
    > Hi:
    >
    > We have a cluster (v13.2.4), and we ran some tests on an EC k=2, m=1
    > pool "VMPool0". We deployed some VMs (Windows, CentOS 7) on the pool and
    > then used IOMeter to write data to them. After a period of time, we
    > observed something strange: the pool's actual usage is much larger than
    > the stored data * 1.5 (stored_raw).
    >
    > [root@Sim-str-R6-4 ~]# ceph df
    > GLOBAL:
    >      CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    >        hdd     3.5 TiB     1.8 TiB     1.7 TiB      1.7 TiB         48.74
    >      TOTAL     3.5 TiB     1.8 TiB     1.7 TiB      1.7 TiB         48.74
    > POOLS:
    >      NAME                 ID     USED        %USED     MAX AVAIL     OBJECTS
    >      cephfs_data          1       29 GiB     100.00          0 B        2597
    >      cephfs_md            2      831 MiB     100.00          0 B         133
    >      erasure_meta_hdd     3       22 MiB     100.00          0 B         170
    >      VMPool0              4      1.2 TiB      56.77       644 GiB      116011
    >      stresspool           5      2.6 MiB     100.00          0 B          32
    >
    > [root@Sim-str-R6-4 ~]# ceph df detail -f json-pretty
    > -----snippet-----
    >          {
    >              "name": "VMPool0",
    >              "id": 4,
    >              "stats": {
    >                  "kb_used": 1328888832,
    >                  "bytes_used": 1360782163968, <----------------
    >                  "percent_used": 0.567110,
    >                  "max_avail": 692481687552,
    >                  "objects": 116011,
    >                  "quota_objects": 0,
    >                  "quota_bytes": 0,
    >                  "dirty": 116011,
    >                  "rd": 27449034,
    >                  "rd_bytes": 126572760064,
    >                  "wr": 20675381,
    >                  "wr_bytes": 1006460652544,
    >                  "comp_ratio": 1.000000,
    >                  "stored": 497657610240,
    >                  "stored_raw": 746486431744,  <----------------
    >              }
    >          },
    >
    > The perf counters of all OSDs (all HDD) used by VMPool0 also show
    > that bluestore_allocated is much larger than bluestore_stored.
    >
    > [root@Sim-str-R6-4 ~]# for i in {0..3}; do echo $i; ceph daemon osd.$i perf dump | grep bluestore | head -6; done
    > 0
    >      "bluestore": {
    >          "bluestore_allocated": 175032369152, <----------------
    >          "bluestore_stored": 83557936482, <----------------
    >          "bluestore_compressed": 958795770,
    >          "bluestore_compressed_allocated": 6431965184,
    >          "bluestore_compressed_original": 18576584704,
    > 1
    >      "bluestore": {
    >          "bluestore_allocated": 119943593984, <----------------
    >          "bluestore_stored": 53325238866, <----------------
    >          "bluestore_compressed": 670158436,
    >          "bluestore_compressed_allocated": 4751818752,
    >          "bluestore_compressed_original": 13752328192,
    > 2
    >      "bluestore": {
    >          "bluestore_allocated": 155444707328, <----------------
    >          "bluestore_stored": 69067116553, <----------------
    >          "bluestore_compressed": 565170876,
    >          "bluestore_compressed_allocated": 4614324224,
    >          "bluestore_compressed_original": 13469696000,
    > 3
    >      "bluestore": {
    >          "bluestore_allocated": 128179240960, <----------------
    >          "bluestore_stored": 60884752114, <----------------
    >          "bluestore_compressed": 1653455847,
    >          "bluestore_compressed_allocated": 9741795328,
    >          "bluestore_compressed_original": 27878768640,
    >
    > [root@Sim-str-R6-5 osd]# for i in {4..7}; do echo $i; sh -c "ceph daemon osd.$i perf dump | grep bluestore | head -6"; done
    > 4
    >      "bluestore": {
    >          "bluestore_allocated": 165950652416, <----------------
    >          "bluestore_stored": 80255191687, <----------------
    >          "bluestore_compressed": 1526871060,
    >          "bluestore_compressed_allocated": 8900378624,
    >          "bluestore_compressed_original": 25324142592,
    > 5
    > admin_socket: exception getting command descriptions: [Errno 111]
    > Connection refused
    > 6
    >      "bluestore": {
    >          "bluestore_allocated": 166022152192, <----------------
    >          "bluestore_stored": 84645390708, <----------------
    >          "bluestore_compressed": 1169055606,
    >          "bluestore_compressed_allocated": 8647278592,
    >          "bluestore_compressed_original": 25135091712,
    > 7
    >      "bluestore": {
    >          "bluestore_allocated": 204633604096, <----------------
    >          "bluestore_stored": 100116382041, <----------------
    >          "bluestore_compressed": 1081260422,
    >          "bluestore_compressed_allocated": 6510018560,
    >          "bluestore_compressed_original": 18654052352,
    >
    > [root@Sim-str-R6-6 osd]# for i in {8..12}; do echo $i; ceph daemon osd.$i perf dump | grep bluestore | head -6; done
    > 8
    >      "bluestore": {
    >          "bluestore_allocated": 106330193920, <----------------
    >          "bluestore_stored": 45282848089, <----------------
    >          "bluestore_compressed": 1136610231,
    >          "bluestore_compressed_allocated": 7248609280,
    >          "bluestore_compressed_original": 20882960384,
    > 9
    >      "bluestore": {
    >          "bluestore_allocated": 120657412096, <----------------
    >          "bluestore_stored": 52550745942, <----------------
    >          "bluestore_compressed": 1321632665,
    >          "bluestore_compressed_allocated": 7401504768,
    >          "bluestore_compressed_original": 21073027072,
    > 10
    >      "bluestore": {
    >          "bluestore_allocated": 155985772544, <----------------
    >          "bluestore_stored": 73236910054, <----------------
    >          "bluestore_compressed": 98351920,
    >          "bluestore_compressed_allocated": 772210688,
    >          "bluestore_compressed_original": 2242043904,
    > 11
    >      "bluestore": {
    >          "bluestore_allocated": 106040524800, <----------------
    >          "bluestore_stored": 45353612134, <----------------
    >          "bluestore_compressed": 874216443,
    >          "bluestore_compressed_allocated": 4962844672,
    >          "bluestore_compressed_original": 14160310272,
    > 12
    >      "bluestore": {
    >          "bluestore_allocated": 118751363072, <----------------
    >          "bluestore_stored": 52194408691, <----------------
    >          "bluestore_compressed": 782919969,
    >          "bluestore_compressed_allocated": 5546311680,
    >          "bluestore_compressed_original": 16043233280,
    >
    > The min_alloc_size_hdd config of all OSDs is 64K, and the RADOS objects
    > in VMPool0 are all 4M rbd_data.x.xxxxxxx.xxxxxxx objects. It's kind of
    > strange that the allocated space is so much larger than the stored data.
    > Can anyone explain this?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



