Hey Ceph users,
we are currently facing some serious problems on our Ceph cluster with
libvirt (KVM), RBD devices and fstrim running inside the VMs.
The problem: right after running the fstrim command inside a VM, the
ext4 filesystem is corrupted and remounted read-only with the following
error messages:
EXT4-fs error (device sda1): ext4_mb_generate_buddy:756: group 136,
block bitmap and bg descriptor inconsistent: 32200 vs 32768 free clusters
Aborting journal on device sda1-8
EXT4-fs (sda1): Remounting filesystem read-only
EXT4-fs error (device sda1): ext4_journal_check_start:56: Detected
aborted journal
EXT4-fs (sda1): Remounting filesystem read-only
This behavior is reproducible across several VMs with different OS
releases (Ubuntu 14.04, 16.04 and 18.04), so we suspect a bug or a
configuration problem regarding the RBD devices.
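For reference, nothing fancy is needed to trigger it; inside the guest we
basically just run (mount point is only an example, / sits on the
RBD-backed sda1 in our case):
# fstrim -v /
and the ext4 errors shown above appear immediately in the guest kernel
log (dmesg).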
Our setup on the hosts running the VMs looks like this:
# lsb_release -d
Description: Ubuntu 20.04 LTS
# uname -a
Linux XXX 5.4.0-37-generic #41-Ubuntu SMP Wed Jun 3 18:57:02 UTC 2020
x86_64 x86_64 x86_64 GNU/Linux
# ceph --version
ceph version 15.2.3 (d289bbdec69ed7c1f516e0a093594580a76b78d0) octopus
(stable)
-> I know there is an update to Ceph 15.2.4, but I have not seen any
fstrim/discard-related changes in its changelog. If upgrading to 15.2.4
fixed the problem, I would happily do so...
The libvirt config for the RBD device with fstrim (discard) support
looks like this:
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='directsync' io='native' discard='unmap'/>
  <auth username='libvirt'>
    <secret type='ceph' usage='client.libvirt'/>
  </auth>
  <source protocol='rbd' name='cephstorage/testtrim_system'>
    <host name='XXX' port='6789'/>
    <host name='XXX' port='6789'/>
    <host name='XXX' port='6789'/>
    <host name='XXX' port='6789'/>
    <host name='XXX' port='6789'/>
  </source>
  <target dev='sda' bus='scsi'/>
  <boot order='2'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
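To double-check that discard is actually exposed to the guest with this
config, the effective values can be inspected inside the VM, for example
(sda is just our device name here):
# lsblk --discard /dev/sda
# cat /sys/block/sda/queue/discard_granularity
# cat /sys/block/sda/queue/discard_max_bytes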
The Ceph docs (https://docs.ceph.com/docs/octopus/rbd/qemu-rbd/) gave me
some hints about enabling trim/discard, and I also tested 4M as discard
granularity, but I got the same error and a corrupted ext4 filesystem.
Changes made to the libvirt config:
<qemu:commandline>
  <qemu:arg value='-set'/>
  <qemu:arg value='device.scsi0-0-0-0.discard_granularity=4194304'/>
</qemu:commandline>
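To verify whether that override actually reaches QEMU, the device
properties can be dumped via the monitor, for example (the domain name
here is just a placeholder):
# virsh qemu-monitor-command testtrim --hmp 'info qtree' | grep discard_granularity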
As the RBD devices are thin-provisioned, we really need to run fstrim
inside the VMs regularly to free up unused blocks; otherwise our Ceph
pool will run out of space.
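For the record, the idea is simply to rely on the fstrim.timer shipped
with util-linux inside the guests and to watch the actual usage on the
Ceph side, roughly like this (pool/image name as in the config above):
Inside the VM (weekly trim via the stock systemd timer):
# systemctl enable --now fstrim.timer
On the Ceph side (how much the image actually occupies):
# rbd du cephstorage/testtrim_system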
Any ideas what could be wrong with our RBD setup, or can somebody else
reproduce the problem?
Any hints on how to debug this problem?
Any related/open Ceph issues? (I could not find one.)
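In case it helps with debugging, I could of course turn up RBD client
logging on the hypervisor for the affected VM; the debug levels and log
path below are just a guess on my side:
[client.libvirt]
    debug rbd = 20
    debug objecter = 20
    log file = /var/log/ceph/qemu-rbd.$pid.log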
Thanks a lot for your help, Georg