Re: OSD deadlock with cephfs client and OSD on same machine

On Tuesday 29 May 2012 you wrote:
> On Tue, 29 May 2012, Amon Ott wrote:
> > Conclusion: If you want to run OSD and cephfs kernel client on the same
> > Linux server and have a libc6 before 2.14 (e.g. Debian's newest in
> > experimental is 2.13) or a kernel before 2.6.39, either do not use ext4
> > (but btrfs is still unstable) or risk data loss by missing syncs through
> > the workaround of forcing filestore_fsync_flushes_journal_data to true.
>
> Note that filestore_fsync_flushes_journal_data should only be set to true
> with ext3 and the 'data=ordered' or 'data=journal' mount option.  It is an
> implementation artifact only that fsync() will flush all previous writes.

I am fully aware of that; this is why I mentioned the risk of data loss.

> > Please consider putting out a fat warning at least at build time, if
> > syncfs() is not available, e.g. "No syncfs() syscall, please expect a
> > deadlock when running osd on non-btrfs together with a local cephfs
> > mount." Even better would be a quick runtime test for missing syncfs()
> > and storage on non-btrfs that spits out a warning, if deadlock is
> > possible.
>
> I think a runtime warning makes more sense; nobody will see the build time
> warning (e.g., those installed debs).

Yes, fully agreed.
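
To make the idea concrete, here is a rough sketch of such a runtime check,
written in Python only for brevity (the real check would live in the OSD
code). Everything in it is an assumption for illustration: the function
names are hypothetical, the filesystem type is read from text in
/proc/mounts format, and a kernel cephfs mount is assumed to show up there
with type "ceph". It ignores escaped characters in mount points.

```python
import ctypes
import os


def libc_has_syncfs():
    """Return True if the C library in this process exports syncfs()."""
    try:
        libc = ctypes.CDLL(None)
    except (OSError, TypeError):
        return False
    # ctypes raises AttributeError for unknown symbols, so hasattr() works.
    return hasattr(libc, "syncfs")


def fs_type_for(path, mounts_text):
    """Return the fs type of the longest mount point that contains path.

    mounts_text uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    path = os.path.abspath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= best_len:
            # ">=" lets a later mount on the same point shadow an earlier one.
            best_len, best_type = len(mount_point), fs_type
    return best_type


def deadlock_warning(osd_data_path, mounts_text, has_syncfs):
    """Return a warning string if the deadlock seems possible, else None."""
    if has_syncfs:
        return None
    if fs_type_for(osd_data_path, mounts_text) == "btrfs":
        return None
    # Assumption: kernel cephfs mounts are listed with fs type "ceph".
    local_cephfs = any(
        len(f) >= 3 and f[2] == "ceph"
        for f in (line.split() for line in mounts_text.splitlines())
    )
    if not local_cephfs:
        return None
    return ("No syncfs() syscall, please expect a deadlock when running "
            "osd on non-btrfs together with a local cephfs mount.")
```

On a live system it could be called with the contents of /proc/mounts and
the result of libc_has_syncfs(). Note that this only tests whether libc
exports the symbol; an old kernel could still lack the syscall itself.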

> > As a side effect, the experienced lockup seems to be a good way to
> > reproduce the long standing bug 1047 - when our cluster tried to recover,
> > all MDS instances died with those symptoms. It seems that a partial sync
> > of journal or data partition causes that broken state.
>
> Interesting!  If you could also note on that bug what the metadata
> workload was (what was making hard links?), that would be great!

We are automatically creating up to 200 preconfigured home directories on
all four nodes; each home directory consists of about 400 directories and
files with about 16 MB of data. AFAIK, there are no hard links involved. So
it is a massively parallel creation of many small files, probably with lots
of metadata for them.

I will add that as a note to the bug, too.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Am Köllnischen Park 1    Fax: +49 30 24342336
10179 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Geschäftsführer:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649