Hi Satoru,
sounds good, thanks a in advance!
Just in case below is how updates umount() function should look like:
void BlueFS::umount(bool avoid_compact)
{
dout(1) << __func__ << dendl;
sync_metadata(avoid_compact);
if (cct->_conf->bluefs_check_volume_selector_on_umount) {
_check_vselector_LNF();
}
_close_writer(log.writer);
log.writer = NULL;
log.t.clear();
_flush_bdev();
_stop_alloc();
nodes.file_map.clear();
nodes.dir_map.clear();
super = bluefs_super_t();
_shutdown_logger();
}
Hi Igor,
2022年11月24日(木) 1:08 Igor Fedotov <igor.fedotov@xxxxxxxx>:
Hi Cephers,
The following discussion is inspired by https://tracker.ceph.com/issues/54019 "OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error".
We had brief discussion in a private chat with Adam a while back but IMO it worth wider audience.
Generally given what we have in the ticket it looks like BlueStore's mkfs might not properly flush device(s) on completion. Points for that are:
- one can see garbage/all zeros when reading from disk after such a failure. And apparently OS cache drop brings valid data to its place.
- setting bluefs_buffered_io = false apparently fixes the issue
Given the BlueFS::umount() implementation below shouldn't we just call flush_bdev() explicitly after _close_writer(log.writer) call?
IIUC both sync_metadata and close_write() do not fully guarantee they've flushed everything so another flush_bdev call might be needed. Anyway it wouldn't harm IMO.
What do you think?
@Satoru - I'm curious if you can try a custom patch with a potential fix as you're able to consistently reproduce the issue?
Thank you for your effort. I'll try it. I'll also ask rook users who hit this problem to try this fix.
Thanks,Satoru
void BlueFS::umount(bool avoid_compact)
{
dout(1) << __func__ << dendl;
sync_metadata(avoid_compact);
if (cct->_conf->bluefs_check_volume_selector_on_umount) {
_check_vselector_LNF();
}
_close_writer(log.writer);
log.writer = NULL;
log.t.cleair();
vselector.reset(nullptr);
_stop_alloc();
nodes.file_map.clear();
nodes.dir_map.clear();
super = bluefs_super_t();
_shutdown_logger();
}
Igor Fedotov
Ceph Lead Developer
--
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web | LinkedIn | Youtube | Twitter
Meet us at the SC22 Conference! Learn more
Technology Fast50 Award Winner by Deloitte!
_______________________________________________ Dev mailing list -- dev@xxxxxxx To unsubscribe send an email to dev-leave@xxxxxxx