Re: RBD corruption when removing tier cache

Hi all,

Today I continued my investigation, and maybe somebody will be interested in my findings, so I'm sending them here.

I compared the objects in the hot pool with the objects in the cold pool and they were identical, so I removed the cache tier from the cold pool.

Then I tried to fsck my RBD image in a libvirt virtual machine booted from a rescue CD.

I was only successful with a read-only mount and without replaying the journal (mount -o ro,noload).
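
For reference, approximately what I ran inside the rescue VM (sda1 is the first partition of the image, as in the kernel log below; exact invocations from memory):

fsck.ext4 -n /dev/sda1              # read-only check; a full read-write fsck was not possible at this point
mount -o ro,noload /dev/sda1 /mnt   # read-only mount, do not replay the ext4 journal - this worked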

I noticed that I was getting I/O errors on the disk:

sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
sd 2:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 00 00 08 08 00 00 10 00
blk_update_request: 4 callbacks suppressed
blk_update_request: I/O error, dev sda, sector 2056
buffer_io_error: 61 callbacks suppressed
Buffer I/O error on dev sda1, logical block 1, lost async page write
Buffer I/O error on dev sda1, logical block 2, lost async page write
VFS: Dirty inode writeback failed for block device sda1 (err=-5).

I wanted to write to that block manually. To be safe I first created an RBD snapshot of that image, and after I created it, the problems disappeared.

After creating the snapshot I was able to fsck that filesystem and replay the ext4 journal.
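
The sequence was essentially the following (pool, image and snapshot names are placeholders):

rbd snap create cold-pool/vm-image@rescue   # right after this snapshot the write errors disappeared
fsck.ext4 -f /dev/sda1                      # inside the rescue VM; now the full check and ext4 journal recovery succeeded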

It looks as if the objects in the cold pool were somehow locked so that they could not be modified. Did they get new names after the snapshot, so that modification became possible again? Can I debug this somehow?
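
If anyone has an idea what to look at, these are the checks I can run on the cold pool (cold-pool is a placeholder for my base pool; object and header names come from my image prefix):

rados -p cold-pool stat rbd_data.9c000238e1f29.0000000000000000       # size and mtime of the object
rados -p cold-pool listsnaps rbd_data.9c000238e1f29.0000000000000000  # any snapshot clones of the object
rados -p cold-pool listwatchers rbd_header.9c000238e1f29              # whether some client still holds a watch on the image header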

I continued with cleaning up the hot pool and tried to delete the objects. The delete operation succeeded with rados rm for most of them, but some objects stayed there and I couldn't delete or get them anymore.

rados -p hot ls


rbd_data.9c000238e1f29.0000000000000000
rbd_data.9c000238e1f29.0000000000000621
rbd_data.9c000238e1f29.0000000000000001
rbd_data.9c000238e1f29.0000000000000a2c
rbd_data.9c000238e1f29.0000000000000200
rbd_data.9c000238e1f29.0000000000000622
rbd_data.9c000238e1f29.0000000000000009
rbd_data.9c000238e1f29.0000000000000208
rbd_data.9c000238e1f29.00000000000000c1
rbd_data.9c000238e1f29.0000000000000625
rbd_data.9c000238e1f29.00000000000000d8
rbd_data.9c000238e1f29.0000000000000623
rbd_data.9c000238e1f29.0000000000000624

rados -p hot rm rbd_data.9c000238e1f29.0000000000000000
error removing hot>rbd_data.9c000238e1f29.0000000000000000: (2) No such file or directory

How can I clean up that pool? What could have happened to it?
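
So far I have only thought of checking for leftover namespaces and snapshot clones and trying a per-object evict, something like:

rados -p hot ls --all                                             # list objects in all namespaces
rados -p hot listsnaps rbd_data.9c000238e1f29.0000000000000000    # a leftover clone might explain why rm says ENOENT while ls still shows the object
rados -p hot cache-evict rbd_data.9c000238e1f29.0000000000000000  # per-object evict attempt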


After some additional tests I think that my initial problem was caused by switching the cache mode to forward. So I recommend not only warning when that mode is used, as Ceph does now, but also updating the official documentation page

http://docs.ceph.com/docs/master/rados/operations/cache-tiering/

so that it describes some other way to flush all the objects (for example: turn off the VMs, set a short eviction age or a small target size) and only then remove the overlay.
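
Roughly what I have in mind (untested sketch; "hot" is my cache pool as above, "cold" stands for the base pool, and the tunables are the standard cache-tiering ones):

ceph osd pool set hot cache_min_flush_age 0
ceph osd pool set hot cache_min_evict_age 0
ceph osd pool set hot target_max_objects 1   # make the tiering agent flush and evict everything
rados -p hot cache-flush-evict-all           # with all clients (VMs) stopped
ceph osd tier remove-overlay cold
ceph osd tier remove cold hot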

With regards
Jan Pekar

On 1.12.2017 03:43, Jan Pekař - Imatic wrote:
Hi all,
today I tested adding an SSD cache tier to a pool.
Everything worked, but when I tried to remove it and run

rados -p hot-pool cache-flush-evict-all

I got

         rbd_data.9c000238e1f29.0000000000000000
failed to flush /rbd_data.9c000238e1f29.0000000000000000: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000621
failed to flush /rbd_data.9c000238e1f29.0000000000000621: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000001
failed to flush /rbd_data.9c000238e1f29.0000000000000001: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000a2c
failed to flush /rbd_data.9c000238e1f29.0000000000000a2c: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000200
failed to flush /rbd_data.9c000238e1f29.0000000000000200: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000622
failed to flush /rbd_data.9c000238e1f29.0000000000000622: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000009
failed to flush /rbd_data.9c000238e1f29.0000000000000009: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000208
failed to flush /rbd_data.9c000238e1f29.0000000000000208: (2) No such file or directory
         rbd_data.9c000238e1f29.00000000000000c1
failed to flush /rbd_data.9c000238e1f29.00000000000000c1: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000625
failed to flush /rbd_data.9c000238e1f29.0000000000000625: (2) No such file or directory
         rbd_data.9c000238e1f29.00000000000000d8
failed to flush /rbd_data.9c000238e1f29.00000000000000d8: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000623
failed to flush /rbd_data.9c000238e1f29.0000000000000623: (2) No such file or directory
         rbd_data.9c000238e1f29.0000000000000624
failed to flush /rbd_data.9c000238e1f29.0000000000000624: (2) No such file or directory
error from cache-flush-evict-all: (1) Operation not permitted

I also noticed that switching the cache tier to "forward" is apparently not considered safe:

Error EPERM: 'forward' is not a well-supported cache mode and may corrupt your data.  pass --yes-i-really-mean-it to force.

At the moment of flushing (or switching to forward mode) the RBD image got corrupted, and even fsck was unable to repair it (unable to set superblock flags). I don't know whether this is because the cache is still active and corrupted, or whether ext4 got so messed up that it cannot work anymore.

Even when the VM that was using that pool is stopped, I cannot flush it.
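
One thing I still want to double-check is whether any client keeps a watch on the image even though the VM is down, e.g. (pool and image names are placeholders):

rbd status cold-pool/vm-image   # should list no watchers when the VM is really stopped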

So what did I do wrong? Can I get my data back? Is it safe to remove the cache tier, and how?

Using rados get I can dump the objects to disk, but why can I not flush (evict) them?
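
For clarity, these are the per-object operations I mean (object name from the listing above, output file name arbitrary):

rados -p hot-pool get rbd_data.9c000238e1f29.0000000000000000 /tmp/obj0    # dumping the object works
rados -p hot-pool cache-try-flush rbd_data.9c000238e1f29.0000000000000000  # per-object flush
rados -p hot-pool cache-evict rbd_data.9c000238e1f29.0000000000000000      # per-object evict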

It looks like the same issue as on
http://tracker.ceph.com/issues/12659
but it is unresolved.

I also have some snapshots of the RBD image in the cold pool, but that should not cause problems in production.

I'm using version 12.2.1 on all 4 nodes.

With regards
Jan Pekar

--
============
Ing. Jan Pekař
jan.pekar@xxxxxxxxx | +420603811737
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz
============
--
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



