Assuming you mean a Ceph bug for filesystem deadlock (part of that conversation went off-list, it looks like?), there isn't one. Ceph already uses syncfs() if it's available, and on btrfs it just uses ioctls. But if it can't do that it needs to call sync(), on both the monitor and the OSD. This *will* break things if you have a client mount:
1) the OSD calls sync()
2) the VM tells the client to sync
3) the client tries to flush its data out to the OSD and waits until it's safe
4) the OSD waits until it can sync the data to disk before replying that it's safe -- which it can't do because it's still waiting on its sync to finish

The monitor can trigger this loop by calling sync() itself, although in the common case the client doesn't need to talk to the monitor to do its own sync() so it will appear to work (you don't want to create a product with this assumption though, because it will deadlock eventually -- either make syncfs() work, back the monitor with btrfs, or separate your daemons from your clients). The only reason that cfuse isn't susceptible to this problem is that FUSE doesn't let you wire up sync() (maybe to avoid exactly this problem?).

In your specific case, with a btrfs-backed OSD, I *think* this actually won't cause things to break because the OSD can set up independent syncs. But the underlying problem of syncing can't be "fixed" any more than it already is without breaking our consistency guarantees, so no bug number.

In closing: you hit some kind of issue with xfs or your IO subsystem and shouldn't be running into any trouble with sync deadlocks right now -- but eventually you will and we can't make it better.
-Greg

(Hopefully this email still makes sense; I rewrote it several times trying to figure out what was going on with ceph-fuse!)

On Mon, Nov 7, 2011 at 11:07 AM, Mandell Degerness <mandell@xxxxxxxxxxxxxxx> wrote:
> Can someone give me the bug number for this?
>
> On Sat, Nov 5, 2011 at 7:48 PM, Alexandre Oliva <oliva@xxxxxxxxxxxxxxxxx> wrote:
>> On Nov 5, 2011, Mandell Degerness <mandell@xxxxxxxxxxxxxxx> wrote:
>>
>>> Yes, we are using kernel module for ceph and there was a posix file
>>> system and an RBD mounted on the node at the time. The monitor is not
>>> using either for it's data though.
>>
>> It doesn't matter. The monitor calls sync() quite often, and that waits
>> for *all* filesystems to flush, including the ceph.ko mount, thus the
>> potential deadlock. It can use syncfs() if that's available in kernel
>> and glibc, but I'm not sure that's enough to work around this particular
>> deadlock scenario. As I found out the hard way, there are others in the
>> osd as well, so if you want to mount -o rw on a mon or osd, use the fuse
>> client, or virtualize the mount (never tested this to make sure it
>> actually addresses the problem).
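
For anyone following the thread, the four-step loop described in the reply can be sketched as a toy wait-for graph. This is only an illustrative model, not Ceph code: the node labels and the find_cycle helper are hypothetical names made up for this sketch. It assumes sync() on the OSD host waits for every mounted filesystem (including the kernel client mount), while syncfs() waits only for the OSD's own backing filesystem.

```python
def find_cycle(waits_for):
    """Return a list of nodes forming a wait cycle, or None if there is none."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # We came back to a node that is still waiting: that is a deadlock.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in list(waits_for):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical labels. With a global sync(), the OSD's sync waits on the
# client mount, whose flush waits on the OSD's ack, which waits on the sync.
with_sync = {
    "osd sync()": ["client mount flush"],
    "client mount flush": ["osd commit ack"],
    "osd commit ack": ["osd sync()"],
}

# With syncfs(), the OSD waits only on its own backing filesystem.
with_syncfs = {
    "osd syncfs()": ["osd local disk"],
    "client mount flush": ["osd commit ack"],
    "osd commit ack": ["osd syncfs()"],
}

print(find_cycle(with_sync))
print(find_cycle(with_syncfs))
```

Under these assumptions the first graph contains a cycle and the second does not, which is the difference between the sync() fallback and the syncfs() path discussed above.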