Apparent cephfs bugs

Is this mailing list an appropriate place to report apparent cephfs bugs? I haven't gotten traction over on ceph-users.

A couple of weeks ago we attempted to switch our compute cluster's shared file system from Lustre to CephFS, but had to roll back after users began reporting problems:

1) Some writes failing silently, resulting in 0-size files.
2) Some writes hanging indefinitely. In my experiments the first 4 MiB (4194304 bytes) would be written out fine, but then the process would get stuck; a sketch of the kind of write test I ran follows below.
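
For reference, here is a rough Python sketch of the kind of incremental write test that shows the stall; the mount path and sizes are placeholders rather than our exact values:

#!/usr/bin/env python3
# Rough sketch of a write test for symptom (2): write in 1 MiB chunks and
# report progress, so the stall point is visible.
# The path below is a placeholder on the CephFS mount.
import os

PATH = "/mnt/cephfs/write-hang-test.bin"
CHUNK = 1024 * 1024            # write 1 MiB at a time
TOTAL = 16 * CHUNK             # 16 MiB total, well past the 4 MiB mark

with open(PATH, "wb") as f:
    for i in range(TOTAL // CHUNK):
        f.write(b"\0" * CHUNK)
        f.flush()
        os.fsync(f.fileno())   # push the data out before continuing
        print("wrote %d bytes" % ((i + 1) * CHUNK), flush=True)

# On an affected client this stalls after roughly 4194304 bytes;
# on a healthy client it completes in a few seconds.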

I've generally been unable to trigger these bugs on demand. The exception is (2): it seems to affect only some systems, but on an affected system it can be reproduced every time, at least for a while.

We have close to 200 kernel CephFS clients (trying FUSE mounts resulted in hangs). Most run kernels between 3.10.0-957.27.2.el7 and 3.10.0-1160.62.1.el7; a few machines run 4.18.0-348.20.1.el8_5.

The cluster runs Ceph 16.2.7 and consists of 20 OSD servers with 24-26 disks each. The CephFS metadata pool is stored across 12 OSDs backed by NVMe flash on 3 servers, and there is a single MDS daemon.

Do the problems we've experienced sound like any known bugs?

The MDS was complaining about slow IO while users were experiencing these issues. Could that explain the empty files?
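
On that point, here is a minimal sketch (the path is a placeholder) of the kind of check that would distinguish a truly silent failure from a write error that only surfaces at fsync() or close(), which is where delayed write-back errors would normally be reported to the application:

#!/usr/bin/env python3
# Sketch of a check for symptom (1): if the failing writes are not truly
# silent but their errors only surface at fsync()/close(), this should
# catch them. The path is a placeholder on the CephFS mount.
import os

PATH = "/mnt/cephfs/error-check-test.bin"

fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
try:
    os.write(fd, b"hello cephfs\n")
    os.fsync(fd)      # a delayed write-back error would raise OSError here...
finally:
    os.close(fd)      # ...or here, if it only surfaces at close
print("write, fsync and close all returned cleanly")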


Vlad


