Re: RBD over cache tier over EC pool: rbd rm doesn't remove objects

Irek Fasikhov <malmyzh@xxxxxxxxx> · Tue, 27 Jan 2015 07:24:42 +0000

Hi all,
Indeed, there is a problem. After removing 1 TB of data, the space on the cluster is not freed. Is this expected behaviour or a bug? And how long will it take for the space to be cleaned up?

Sat Sep 20 2014 at 8:19:24 AM, Mikaël Cluseau <mcluseau@xxxxxx>:

    Hi all,

    I have weird behaviour on my firefly "test + convenience storage"
    cluster. It consists of 2 nodes with a light imbalance in available
    space:

    # id    weight    type name       up/down    reweight
    -1      14.58     root default
    -2      8.19          host store-1
    1       2.73              osd.1   up         1
    0       2.73              osd.0   up         1
    5       2.73              osd.5   up         1
    -3      6.39          host store-2
    2       2.73              osd.2   up         1
    3       2.73              osd.3   up         1
    4       0.93              osd.4   up         1

    I used to store ~8TB of rbd volumes, coming to a near-full state.
    There were some annoying "stuck misplaced" PGs, so I began to remove
    4.5TB of data. The weird thing is that the space hasn't been
    reclaimed on the OSDs: they stayed stuck at around 84% usage. I tried
    to move PGs around, and it turns out that the space is correctly
    "reclaimed" if I take an OSD out, let it empty its XFS volume and
    then take it in again.

    I'm currently applying this to each OSD in turn, but I thought it
    could be worth reporting. The current ceph df output is:

    GLOBAL:
        SIZE       AVAIL     RAW USED     %RAW USED
        12103G     5311G     6792G        56.12
    POOLS:
        NAME            ID     USED       %USED     OBJECTS
        data            0      0          0         0
        metadata        1      0          0         0
        rbd             2      444G       3.67      117333
    [...]
        archives-ec     14     3628G      29.98     928902
        archives        15     37518M     0.30      273167

    Before "just moving data", AVAIL was around 3TB.

    I finished the process with the OSDs on store-1, which now show the
    following space usage:

    /dev/sdb1             2.8T  1.4T  1.4T  50%  /var/lib/ceph/osd/ceph-0
    /dev/sdc1             2.8T  1.3T  1.5T  46%  /var/lib/ceph/osd/ceph-1
    /dev/sdd1             2.8T  1.3T  1.5T  48%  /var/lib/ceph/osd/ceph-5

    I'm currently fixing OSD 2; OSD 3 will be the last one to be fixed.
    The df on store-2 shows the following:

    /dev/sdb1             2.8T  1.9T  855G  70%  /var/lib/ceph/osd/ceph-2
    /dev/sdc1             2.8T  2.4T  417G  86%  /var/lib/ceph/osd/ceph-3
    /dev/sdd1             932G  481G  451G  52%  /var/lib/ceph/osd/ceph-4

    OSD 2 was at 84% 3h ago, and OSD 3 was ~75%.

    During rbd rm (which took a bit more than 3 days), the ceph log was
    showing things like this:

    2014-09-03 16:17:38.831640 mon.0 192.168.1.71:6789/0 417194 :
      [INF] pgmap v14953987: 3196 pgs: 2882 active+clean, 314
      active+remapped; 7647 GB data, 11067 GB used, 3828 GB / 14896 GB
      avail; 0 B/s rd, 6778 kB/s wr, 18 op/s; -5/5757286 objects
      degraded (-0.000%)

    [...]

    2014-09-05 03:09:59.895507 mon.0 192.168.1.71:6789/0 513976 :
      [INF] pgmap v15050766: 3196 pgs: 2882 active+clean, 314
      active+remapped; 6010 GB data, 11156 GB used, 3740 GB / 14896 GB
      avail; 0 B/s rd, 0 B/s wr, 8 op/s; -388631/5247320 objects
      degraded (-7.406%)

    [...]

    2014-09-06 03:56:50.008109 mon.0 192.168.1.71:6789/0 580816
      : [INF] pgmap v15117604: 3196 pgs: 2882 active+clean, 314
      active+remapped; 4865 GB data, 11207 GB used, 3689 GB / 14896 GB
      avail; 0 B/s rd, 6117 kB/s wr, 22 op/s; -706519/3699415 objects
      degraded (-19.098%)

    2014-09-06 03:56:44.476903 osd.0 192.168.1.71:6805/11793
      729 : [WRN] 1 slow requests, 1 included below; oldest blocked for
      > 30.058434 secs

    2014-09-06 03:56:44.476909 osd.0 192.168.1.71:6805/11793
      730 : [WRN] slow request 30.058434 seconds old, received at
      2014-09-06 03:56:14.418429: osd_op(client.19843278.0:46081
      rb.0.c7fd7f.238e1f29.00000000b3fa [delete] 15.b8fb7551
      ack+ondisk+write e38950) v4 currently waiting for blocked object

    2014-09-06 03:56:49.477785 osd.0 192.168.1.71:6805/11793
      731 : [WRN] 2 slow requests, 1 included below; oldest blocked for
      > 35.059315 secs

    [... stabilizes here:]

    2014-09-06 22:13:48.771531 mon.0 192.168.1.71:6789/0 632527
      : [INF] pgmap v15169313: 3196 pgs: 2882 active+clean, 314
      active+remapped; 4139 GB data, 11215 GB used, 3681 GB / 14896 GB
      avail; 64 B/s rd, 64 B/s wr, 0 op/s; -883219/3420796 objects
      degraded (-25.819%)

    [...]

    2014-09-07 03:09:48.491325 mon.0 192.168.1.71:6789/0 633880
      : [INF] pgmap v15170666: 3196 pgs: 2882 active+clean, 314
      active+remapped; 4139 GB data, 11215 GB used, 3681 GB / 14896 GB
      avail; 18727 B/s wr, 2 op/s; -883219/3420796 objects degraded
      (-25.819%)

    And now, during the data movement I described before:

    2014-09-20 15:16:13.394694 mon.0 [INF] pgmap v15344707: 3196
      pgs: 2132 active+clean, 432 active+remapped+wait_backfill, 621
      active+remapped, 11 active+remapped+backfilling; 4139 GB data,
      6831 GB used, 5271 GB / 12103 GB avail; 379097/3792969 objects
      degraded (9.995%)
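
The pgmap summaries quoted above can be compared side by side with a short script. This is only a sketch: the regular expression, the `parse_pgmap` helper and the `samples` list are illustrative assumptions, and the sample strings are the "data / used / avail" parts of the log lines above, re-joined onto one line each. It prints how far "data" and "used" moved relative to the first sample.

```python
import re

# Matches the "<N> GB data, <N> GB used, <N> GB / <N> GB avail" part of a pgmap line.
PGMAP_RE = re.compile(r"(\d+) GB data, (\d+) GB used, (\d+) GB / (\d+) GB avail")


def parse_pgmap(line):
    """Return (data, used, avail, total) in GB, or None if no pgmap summary is found."""
    m = PGMAP_RE.search(line)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


# Summaries copied from the log excerpts above (timestamps shortened).
samples = [
    ("2014-09-03 16:17", "7647 GB data, 11067 GB used, 3828 GB / 14896 GB avail"),
    ("2014-09-05 03:09", "6010 GB data, 11156 GB used, 3740 GB / 14896 GB avail"),
    ("2014-09-06 03:56", "4865 GB data, 11207 GB used, 3689 GB / 14896 GB avail"),
    ("2014-09-06 22:13", "4139 GB data, 11215 GB used, 3681 GB / 14896 GB avail"),
    ("2014-09-20 15:16", "4139 GB data, 6831 GB used, 5271 GB / 12103 GB avail"),
]

first_data, first_used, _, _ = parse_pgmap(samples[0][1])

rows = []
for stamp, line in samples:
    data, used, avail, total = parse_pgmap(line)
    # Change relative to the first sample, in GB.
    rows.append((stamp, data - first_data, used - first_used))
    print(f"{stamp}  data {data - first_data:+6d} GB   used {used - first_used:+6d} GB")
```

With these figures, "data" falls by 3508 GB over the course of the rbd rm while "used" rises by 148 GB; "used" only drops (by 4236 GB against the first sample) in the last line, taken during the data movement. That last sample reports a smaller total (12103 GB instead of 14896 GB), so it is not strictly comparable with the earlier ones.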

    If a ceph developer wants me to do something or to provide some
    data, please say so quickly; I will probably process OSD 3 in
    ~16-20h.

    (of course, I'd prefer not to lose the data btw :-))

_______________________________________________

ceph-users mailing list

ceph-users@xxxxxxxxxxxxxx

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
