On Mon, 11 Feb 2019, Jeff Layton wrote:
> On Mon, 2019-02-11 at 09:22 +0100, Dan van der Ster wrote:
> > Hi all,
> >
> > Does anyone know if ceph and level/rocksdb are already immune to these
> > fsync issues discovered by the postgresql devs?
> >
> > https://fosdem.org/2019/schedule/event/postgresql_fsync/
> > https://wiki.postgresql.org/wiki/Fsync_Errors
> > https://www.postgresql.org/message-id/flat/CAMsr%2BYHh%2B5Oq4xziwwoEfhoTZgr07vdGG%2Bhu%3D1adXx59aTeaoQ%40mail.gmail.com
> >
> > Cheers, Dan
>
> Great question. I took a brief look at the rocksdb code but wasn't able
> to draw a meaningful conclusion there.
>
> I do see that you can set it up to use O_DIRECT, but it's not clear to
> me that it propagates fsync errors in a meaningful way if you don't. I'm
> also not sure how ceph configures rocksdb to operate here either.
>
> I think it'd be good to reach out to the rocksdb developers and see
> whether they've considered its behavior in the face of a writeback
> failure. I'm happy to discuss with them if they have questions about the
> kernel's behavior.

Looking at the filestore code, I see that WBThrottle isn't checking the
fsync(2) return value! That's an easy fix (we should assert/panic).
Opened

The bluestore code (os/bluestore/KernelDevice) looks fine (there is a
single call to fdatasync(2) and we abort on any error).

sage
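
[For illustration, a minimal sketch of the check-and-abort pattern described above; this is not the actual WBThrottle or KernelDevice code, and sync_or_die is a hypothetical helper. The point is that every fsync(2)/fdatasync(2) return value must be inspected, because on Linux the kernel may report a writeback error only once and then clear it, so a later retry can "succeed" even though data was lost.]

    // sketch.cc -- hypothetical example, not Ceph source
    #include <cerrno>
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <fcntl.h>
    #include <unistd.h>

    // Treat a failed sync as fatal instead of silently ignoring it.
    static void sync_or_die(int fd, bool data_only)
    {
      int r = data_only ? ::fdatasync(fd) : ::fsync(fd);
      if (r < 0) {
        // Once fsync has reported failure, the dirty pages may already
        // have been dropped; retrying cannot recover the data. Crashing
        // lets the OSD recover from its peers rather than acknowledge a
        // write that never reached stable storage.
        std::fprintf(stderr, "sync failed: %s\n", std::strerror(errno));
        std::abort();  // real code would use Ceph's assert/abort macros
      }
    }

    int main()
    {
      int fd = ::open("example.dat", O_WRONLY | O_CREAT, 0644);
      if (fd < 0) {
        std::perror("open");
        return 1;
      }
      const char buf[] = "payload";
      if (::write(fd, buf, sizeof(buf)) < 0) {
        std::perror("write");
        return 1;
      }
      sync_or_die(fd, /*data_only=*/false);  // panic instead of ignoring -EIO
      ::close(fd);
      return 0;
    }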