BlueStore _txc_add_transaction errors (possibly related to bug #38724)

Hi everyone,

it seems there have been several reports in the past related to
BlueStore OSDs crashing from unhandled errors in _txc_add_transaction:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-April/034444.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032172.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/031960.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/031964.html

Bug #38724 tracks this. It has been fixed in master with
https://github.com/ceph/ceph/pull/27929 and is pending backports. I dare
say it is *probably* misclassified as minor, since it can lead to data
loss as soon as it affects enough OSDs simultaneously. The tracker
entry:

https://tracker.ceph.com/issues/38724

We just ran into a similar issue with a couple of BlueStore OSDs that we
recently added to a Luminous (12.2.12) cluster that was upgraded from
Jewel, and hence, still largely runs on FileStore. I say similar because
evidently other people reporting this problem have been running into
ENOENT (No such file or directory) or ENOTEMPTY (Directory not empty);
for us it's interestingly E2BIG (Argument list too long):

https://tracker.ceph.com/issues/38724#note-26
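
For anyone comparing their own crash logs against these reports, here is a
minimal Python sketch that maps a numeric error code to its symbolic name
and message. It only uses the standard errno and os modules; the helper
name describe_errno is made up for illustration, and it assumes the code
in the log is a plain (possibly negated) errno value:

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (message)' for an errno value.

    Accepts negative values too, since error codes are often
    passed around negated (assumption: the logged value is a
    plain errno number).
    """
    code = abs(code)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # The three errors mentioned in this thread.
    for code in (errno.ENOENT, errno.ENOTEMPTY, errno.E2BIG):
        print(code, describe_errno(-code))
```

The numeric values and message strings come from the local platform, so
run this on the same kind of system that produced the log.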

So I'm wondering if someone could shed light on these questions:

* Is this the same issue as the one that
https://github.com/ceph/ceph/pull/27929 fixes?

* If so: since https://github.com/ceph/ceph/pull/29115 (the Nautilus
backport for that fix) has been merged but is not yet included in a
release, will *Nautilus* users get the fix in the upcoming 14.2.3
release, and once they update, will this bug go away with no further
intervention required?

* For users on *Luminous*, since https://tracker.ceph.com/issues/39694
(the Luminous version of 38724) says "non-trivial backport", is it fair
to say that a fix might still take a while for that release?

* Finally, are Luminous users safe from this bug if they keep using, or
revert to, FileStore?

Thanks in advance for your thoughts! Please keep Erik CC'd on your reply.

Cheers,
Florian