Re: Data distribution question

Shain Miley <smiley@xxxxxxx> · Thu, 2 May 2019 12:45:57 -0400



    Just to follow up on this:
    I ended up enabling the balancer module in upmap mode.
    This did resolve the short-term issue and even things out a
      bit...but things are still far from uniform.
    It seems like the balancer option is an ongoing process that
      continues to run over time...so maybe things will improve even
      more over the next few weeks.
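
For anyone finding this thread later, the steps to turn on the balancer in upmap mode on a Luminous cluster look roughly like the sketch below. The `ceph` subcommands are the standard ones from the mgr balancer documentation; the `CEPH` variable and the two function names are only illustrative helpers (not part of Ceph) so the sequence can be dry-run with `CEPH=echo` before touching a real cluster.

```shell
#!/bin/sh
# CEPH is a hypothetical override for dry runs; by default the real CLI is used.
CEPH="${CEPH:-ceph}"

# Enable the mgr balancer in upmap mode.
# upmap needs every client to be Luminous or newer, so that is declared first.
enable_upmap_balancer() {
    $CEPH osd set-require-min-compat-client luminous &&
    $CEPH mgr module enable balancer &&
    $CEPH balancer mode upmap &&
    $CEPH balancer on
}

# Show whether the balancer is active and how it scores the current distribution.
show_balancer_progress() {
    $CEPH balancer status
    $CEPH balancer eval
}
```

Running `enable_upmap_balancer` once and then `show_balancer_progress` every so often is one way to watch whether the distribution keeps improving over time.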
    Thank you to everyone who helped provide insight into possible
      solutions.
    Shain

    
    On 4/30/19 2:08 PM, Dan van der Ster wrote:

    
        Removing pools won't make a difference.
          

          Read up to slide 22 here: https://www.slideshare.net/mobile/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer
          

          ..
          Dan
          

          (Apologies for terseness, I'm mobile)
          

            On Tue, 30 Apr 2019, 20:02 Shain Miley <smiley@xxxxxxx> wrote:

            
              Here is the per pool pg_num info:

              
              'data' pg_num 64

              'metadata' pg_num 64

              'rbd' pg_num 64

              'npr_archive' pg_num 6775

              '.rgw.root' pg_num 64

              '.rgw.control' pg_num 64

              '.rgw' pg_num 64

              '.rgw.gc' pg_num 64

              '.users.uid' pg_num 64

              '.users.email' pg_num 64

              '.users' pg_num 64

              '.usage' pg_num 64

              '.rgw.buckets.index' pg_num 128

              '.intent-log' pg_num 8

              '.rgw.buckets' pg_num 64

              'kube' pg_num 512

              '.log' pg_num 8
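
As a quick sanity check on the list above, the per-pool values can be summed to see what share of all PGs one pool holds. This is only a sketch: `pg_share` is a made-up helper name, and it assumes input lines in exactly the `'name' pg_num N` form shown above.

```shell
#!/bin/sh
# pg_share POOL: read "'name' pg_num N" lines on stdin and print the total
# PG count plus the share held by POOL. (Hypothetical helper, not a Ceph tool.)
pg_share() {
    LC_ALL=C awk -v pool="$1" -v q="'" '
        $2 == "pg_num" {
            name = $1
            gsub(q, "", name)          # strip the surrounding single quotes
            total += $3
            if (name == pool) mine = $3
        }
        END {
            if (total > 0)
                printf "total=%d %s=%d share=%.1f%%\n", total, pool, mine, 100 * mine / total
        }
    '
}

# The pg_num values reported in this thread.
pools="'data' pg_num 64
'metadata' pg_num 64
'rbd' pg_num 64
'npr_archive' pg_num 6775
'.rgw.root' pg_num 64
'.rgw.control' pg_num 64
'.rgw' pg_num 64
'.rgw.gc' pg_num 64
'.users.uid' pg_num 64
'.users.email' pg_num 64
'.users' pg_num 64
'.usage' pg_num 64
'.rgw.buckets.index' pg_num 128
'.intent-log' pg_num 8
'.rgw.buckets' pg_num 64
'kube' pg_num 512
'.log' pg_num 8"

printf '%s\n' "$pools" | pg_share npr_archive
```

The total matches the 8199 PGs mentioned in the original post, with npr_archive holding 6775 of them.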

              
              Here is the df output:

              
              GLOBAL:
                   SIZE        AVAIL       RAW USED     %RAW USED
                   1.06PiB     306TiB      778TiB       71.75
              POOLS:
                   NAME                   ID     USED        %USED     MAX AVAIL     OBJECTS
                   data                   0      11.7GiB     0.14      8.17TiB       3006
                   metadata               1      0B          0         8.17TiB       0
                   rbd                    2      43.2GiB     0.51      8.17TiB       11147
                   npr_archive            3      258TiB      97.93     5.45TiB       82619649
                   .rgw.root              4      1001B       0         8.17TiB       5
                   .rgw.control           5      0B          0         8.17TiB       8
                   .rgw                   6      6.16KiB     0         8.17TiB       35
                   .rgw.gc                7      0B          0         8.17TiB       32
                   .users.uid             8      0B          0         8.17TiB       0
                   .users.email           9      0B          0         8.17TiB       0
                   .users                 10     0B          0         8.17TiB       0
                   .usage                 11     0B          0         8.17TiB       1
                   .rgw.buckets.index     12     0B          0         8.17TiB       26
                   .intent-log            17     0B          0         5.45TiB       0
                   .rgw.buckets           18     24.2GiB     0.29      8.17TiB       6622
                   kube                   21     1.82GiB     0.03      5.45TiB       550
                   .log                   22     0B          0         5.45TiB       176

              
              The stuff in the data pool and the rgw pools is old data
              that we used for testing...if you guys think that removing
              everything outside of rbd and npr_archive would make a
              significant impact I will give it a try.

              
              Thanks,

              
              Shain

              
              On 4/30/19 1:15 PM, Jack wrote:

              > Hi,

              >

              > I see that you are using RGW.
              > RGW comes with many pools, yet most of them are used for metadata and
              > configuration; those do not store much data.
              > Such pools do not need more than a couple of PGs each (I use pg_num = 8).
              >
              > You need to allocate your PGs to the pools that actually store the data.

              >

              > Please do the following, to let us know more:
              > Print the pg_num per pool:
              > for i in $(rados lspools); do echo -n "$i: "; ceph osd pool get "$i" pg_num; done

              >

              > Print the usage per pool:

              > ceph df

              >

              > Also, instead of doing a "ceph osd reweight-by-utilization", check out
              > the balancer plugin: http://docs.ceph.com/docs/mimic/mgr/balancer/

              >

              > Finally, in Nautilus, the PG count can now scale up and down automatically.
              > See https://ceph.com/rados/new-in-nautilus-pg-merging-and-autotuning/
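
For reference, on a cluster that has been upgraded to Nautilus (the 12.2.11 cluster in this thread is still Luminous, so this does not apply to it as-is), turning on the PG autoscaler for one pool could look like the sketch below. The `ceph` subcommands are the documented Nautilus ones; `CEPH` and `enable_pg_autoscaler` are illustrative names only, and `warn` mode is chosen here as a cautious assumption so that the autoscaler only reports what it would change.

```shell
#!/bin/sh
# CEPH is a hypothetical override for dry runs; by default the real CLI is used.
CEPH="${CEPH:-ceph}"

# enable_pg_autoscaler POOL: enable the mgr module, put POOL in "warn" mode
# (health warnings only, no automatic pg_num changes), then show the
# autoscaler's view of all pools.
enable_pg_autoscaler() {
    pool="$1"
    $CEPH mgr module enable pg_autoscaler &&
    $CEPH osd pool set "$pool" pg_autoscale_mode warn &&
    $CEPH osd pool autoscale-status
}
```

Switching the mode from `warn` to `on` later lets the autoscaler adjust pg_num itself.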

              >

              >

              > On 04/30/2019 06:34 PM, Shain Miley wrote:

              >> Hi,

              >>

              >> We have a cluster with 235 OSDs running version 12.2.11 with a
              >> combination of 4 and 6 TB drives.  The data distribution across OSDs
              >> varies from 52% to 94%.

              >>

              >> I have been trying to figure out how to get this a bit more balanced as
              >> we are running into 'backfillfull' issues on a regular basis.

              >>

              >> I've tried adding more pgs...but this did not seem to do much in terms
              >> of the imbalance.

              >>

              >> Here is the end output from 'ceph osd df':

              >>

              >> MIN/MAX VAR: 0.73/1.31  STDDEV: 7.73
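
To make the MIN/MAX VAR line a bit more concrete: as far as I understand it, VAR in `ceph osd df` is an OSD's utilization divided by the average utilization. A rough sketch of that arithmetic follows; `var_ratio` is a made-up helper name, and the 71.75 figure is the cluster-wide %RAW USED reported elsewhere in this thread, used here as a stand-in for the average.

```shell
#!/bin/sh
# var_ratio UTIL AVG: print UTIL / AVG rounded to two decimals.
# (Hypothetical helper to illustrate the VAR column; not a Ceph command.)
var_ratio() {
    LC_ALL=C awk -v util="$1" -v avg="$2" 'BEGIN {
        if (avg > 0) printf "%.2f\n", util / avg
    }'
}

# Fullest and emptiest OSDs from the thread against the 71.75% average.
var_ratio 94 71.75
var_ratio 52 71.75
```

The fullest OSD comes out at 1.31, which lines up with the reported MAX VAR; the emptiest comes out near 0.72, close to the reported 0.73 given that the 52% figure is itself rounded.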

              >>

              >> We have 8199 pgs total with 6775 of them in the pool that has 97% of the
              >> data.

              >>

              >> The other pools are not really used (data, metadata, .rgw.root,
              >> .rgw.control, etc).  I have thought about deleting those unused pools so
              >> that most if not all the pgs are being used by the pool with the
              >> majority of the data.

              >>

              >> However...before I do that...is there anything else I can do or try in
              >> order to see if I can balance out the data more uniformly?

              >>

              >> Thanks in advance,

              >>

              >> Shain

              >>

              > _______________________________________________

              > ceph-users mailing list

              > ceph-users@xxxxxxxxxxxxxx

              > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

              
              -- 

              NPR | Shain Miley | Manager of Infrastructure, Digital Media | smiley@xxxxxxx | 202.513.3649

              

            
    -- 
NPR | Shain Miley | Manager of Infrastructure, Digital Media | smiley@xxxxxxx | 202.513.3649
  
