Re: SSD-Cache Tier + RBD-Cache = Filesystem corruption?


 



Hello,

I'm quite concerned by this (and the silence from the devs); however, a
number of people are doing similar things (at least with Hammer), and
you'd think they would have been bitten by this if it were a systemic bug.

More below.

On Sat, 6 Feb 2016 11:31:51 +0100 Udo Waechter wrote:

> Hello,
> 
> I am experiencing totally weird filesystem corruptions with the
> following setup:
> 
> * Ceph infernalis on Debian8
Hammer here, might be a regression.
> * 10 OSDs (5 hosts) with spinning disks
> * 4 OSDs (1 host, with SSDs)
> 
So you're running your cache tier host with a replication of 1, I presume?
What kind of SSDs/filesystem/other relevant configuration options?
Could there simply be some corruption on the SSDs that is then, of course,
eventually presented to the RBD clients?
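
To rule that out, one could check the cache pool's replication and the
SSDs themselves. A sketch (the pool name "ssd-cache" and device name are
assumptions, not from the original setup):

# replication settings of the cache pool (pool name assumed)
ceph osd pool get ssd-cache size
ceph osd pool get ssd-cache min_size

# check each cache SSD for media errors and wear
smartctl -a /dev/sdX | grep -i -E 'media|reallocat|wear'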

> The SSDs are new in my setup and I am trying to setup a Cache tier.
> 
> Now, with the spinning disks Ceph is running since about a year without
> any major issues. Replacing disks and all that went fine.
> 
> Ceph is used by rbd+libvirt+kvm with
> 
> rbd_cache = true
> rbd_cache_writethrough_until_flush = true
> rbd_cache_size = 128M
> rbd_cache_max_dirty = 96M
> 
> Also, in libvirt, I have
> 
> cachemode=writeback enabled.
> 
> So far so good.
> 
> Now, I've added the SSD-Cache tier to the picture with "cache-mode
> writeback"
> 
> The SSD-Machine also has "deadline" scheduler enabled.
> 
> Suddenly VMs start to corrupt their filesystems (all ext4) with "Journal
> failed".
> Trying to reboot the machines ends in "No bootable drive"
> Using parted and testdisk on the image mapped via rbd reveals that the
> partition table is gone.
> 
Did turning the cache explicitly off (both Ceph and qemu) fix this?
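
For bisecting this, a minimal no-cache client configuration might look
like the following (a sketch following ceph.conf and libvirt domain XML
conventions; adjust section and device names to your setup):

# ceph.conf on the hypervisor (client) side
[client]
rbd cache = false

and the corresponding disk stanza in the libvirt domain XML:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  ...
</disk>

If corruption stops with both caches off, re-enabling them one at a time
should narrow down which layer is at fault.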

> testdisk finds the proper partitions; afterwards, e2fsck "repairs" the
> filesystem beyond usability.
> 
> This does not happen to all machines; it happens to those that actually
> do some or most of the IO:
> 
> elasticsearch, MariaDB+Galera, postgres, backup, GIT
> 
> Or so I thought -- yesterday one of my LDAP servers died, and that one
> is not doing much IO.
> 
> Could it be that rbd caching + qemu writeback cache + ceph cache tier
> writeback are not playing well together?
> 
> I've read through some older mails on the list, where people had similar
> problems and suspected something like that.
> 
Any particular references (URLs, Message-IDs)?

Regards,

Christian

> What are the proper/right settings for rbd/qemu/libvirt?
> 
> libvirt: cachemode=none (writeback?)
> rbd: cache_mode = none
> SSD-tier: cachemode: writeback
> 
> ?
> 
> Thanks for any help,
> udo.
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


