Re: Failing to mount PVCs

Hi,

I'm not entirely sure this really is the same issue. One of our customers also runs k8s on OpenStack, and I saw similar messages there. We never investigated it ourselves, and I don't know if the customer did, but one thing they ran into was that k8s didn't properly clean up detached/deleted volumes before reattaching them or attaching new ones. This sometimes resulted in the same volume apparently being attached multiple times to the same VM. According to the customer this was caused by the Cinder driver in k8s, but as I said, I'm not sure. Still, the message "Multiply-claimed block(s) in inode" reminded me of that, and the oom killers are also familiar in that environment.

The current workaround there is to properly clean up stale attachments before reattaching, and to use flavors with more resources to prevent the oom killer from triggering.
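To make that a bit more concrete, the manual cleanup I have in mind looks roughly like this with the plain OpenStack CLI (sketch only; the volume and server IDs are placeholders, and your environment may use different tooling):

  # See what Cinder thinks the volume's state is and where it is attached
  openstack volume show <volume-id> -c status -c attachments

  # If it is still shown as attached to a node the pod has already left,
  # detach it manually before k8s tries to attach it again
  openstack server remove volume <server-id> <volume-id>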

This doesn't help much, but at least you know that you're not alone. ;-)


Quoting Fatih Ertinaz <fertinaz@xxxxxxxxx>:

Hi,

We recently started to observe issues similar to the following in our
cluster environment:

Warning FailedMount 31s (x8 over 97s) kubelet, ${NODEIP}  MountVolume.SetUp
failed for volume "${PVCNAME}" : mount command failed, status: Failure,
reason: failed to mount volume /dev/rbd2 [ext4] to /var/lib/kubelet/plugins/
rook.io/rook-ceph/mounts/${PVCNAME}, error 'fsck' found errors on device
/dev/rbd2 but could not correct them: fsck from util-linux 2.23.2
/dev/rbd2 contains a file system with errors, check forced.
/dev/rbd2: Inode 2884174 has an invalid extent node (blk 11567229, lblk 0)
/dev/rbd2: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

The exact error (inode has an invalid extent node) may differ; two other
ones I've seen are "Multiply-claimed block(s) in inode" and "Unattached
inode".

This is a private cloud environment with Kubernetes (1.13) and Ceph. As far
as I know, the worker nodes haven't been rebooted in the past 6 months.
However, I did see some oom killer messages in the logs.
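For reference, checks along these lines on the worker node are what turn them up (sketch only):

  # Look for oom-killer events in the kernel log
  dmesg -T | grep -iE 'oom|out of memory'
  journalctl -k | grep -i oom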

In general, has anyone else seen similar errors before, and does anyone have
an idea what the root cause might be? There is a workaround I applied that
seemed to resolve the issue temporarily (map the rbd image to a new device
and run fsck on it), but I'd very much like to prevent these errors from
happening in the first place.
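In case it helps, the workaround was roughly the following (the pool and image names are placeholders for whatever Rook created for the PVC):

  # Map the image to a fresh device outside of kubelet
  rbd map <pool>/<image-name>     # prints e.g. /dev/rbd3

  # Repair the filesystem, then unmap so kubelet can mount it cleanly again
  fsck.ext4 -fy /dev/rbd3
  rbd unmap /dev/rbd3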

Thank you,

Fatih Ertinaz


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


