Data moved pools but didn't move osds & backfilling+remapped loop

Hi,

I've got an issue with the data in one of our pools. An RBD image containing 4+ TB of data appears to have moved to a different pool after a CRUSH rule change, which should not be possible. On top of that, the cluster keeps looping through remapping and backfilling: it climbs to 377 PGs active+clean, then suddenly drops back to 361, with no crashes according to ceph -w and ceph crash ls.
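
For reference, this is roughly how I've been watching the loop (a simple sketch; the interval and log file name are arbitrary choices):

# Print the PG state summary every 30 s, so the active+clean count
# can be seen climbing to 377 and then dropping back to 361
while true; do date; ceph pg stat; sleep 30; done | tee -a pg-stat.log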

First about the pools:

[root@CEPH-MGMT-1 ~]# ceph df
RAW STORAGE:
    CLASS        SIZE       AVAIL      USED        RAW USED     %RAW USED
    cheaphdd     16 TiB     10 TiB     5.9 TiB      5.9 TiB         36.08
    fasthdd      33 TiB     18 TiB      16 TiB       16 TiB         47.07
    TOTAL        50 TiB     28 TiB      22 TiB       22 TiB         43.44

POOLS:
    POOL         ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    pool1        37      780 B        1.33M      780 B          0       3.4 TiB
    pool2        48     2.0 TiB     510.57k     5.9 TiB     42.64       2.6 TiB

According to ceph df, all the data is now in pool2, while the RBD image was created in pool1 (pool2 is new and should be empty).
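
Where the objects actually live can be checked directly (a sketch; the object name below is a placeholder for one returned by the ls):

# List a few objects in pool1 and ask CRUSH where one of them maps
rados -p pool1 ls | head -5
ceph osd map pool1 rbd_data.<image-id>.0000000000000000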

The steps that got Ceph into this state were (a rough sketch of the corresponding commands follows the list):

- Add OSDs with a different device class (class cheaphdd)
- Create a CRUSH rule called cheapdisks that targets cheaphdd only
- Create pool2 with the new CRUSH rule
- Remove the device class from the previously existing devices (remove class hdd)
- Add class fasthdd to those devices
- Create a new CRUSH rule called fastdisks
- Change the CRUSH rule of pool1 to fastdisks
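
From memory, the commands looked roughly like this (OSD IDs and PG counts are illustrative, not the exact values used):

# New OSDs reclassified as cheaphdd, plus a rule and a pool on top of them
ceph osd crush rm-device-class osd.10
ceph osd crush set-device-class cheaphdd osd.10
ceph osd crush rule create-replicated cheapdisks default host cheaphdd
ceph osd pool create pool2 128 128 replicated cheapdisks

# Existing OSDs reclassified from hdd to fasthdd, then pool1 repointed
ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class fasthdd osd.0
ceph osd crush rule create-replicated fastdisks default host fasthdd
ceph osd pool set pool1 crush_rule fastdisks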

After this, the data appeared to move from pool1 to pool2; however, the RBD image still works, and the disks backing pool1 are still full of data.
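
That the bytes never left the original disks is visible in the per-OSD view:

# Per-OSD utilization laid out along the CRUSH tree; the fasthdd OSDs
# still carry the data even though ceph df attributes it to pool2
ceph osd df tree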

I've tried to reproduce this issue using virtual machines but I couldn't make it happen again.

Some extra information:
ceph osd crush tree --show-shadow ==> https://fe.ax/639aa.H34539.txt
ceph pg ls-by-pool pool1 ==> https://fe.ax/dcacd.H44900.txt (I know the PG count is too low)
ceph pg ls-by-pool pool2 ==> https://fe.ax/95a2c.H51533.txt
ceph -s ==> https://fe.ax/aab41.H69711.txt


Can someone shed some light on why the data looks like it has moved to another pool, and/or explain why the data in pool2 keeps remapping and backfilling in a loop?

Thanks!


Kind regards,

Marco Stuurman
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
