Re: ceph-mon blocked error

Tommi Virtanen <tommi.virtanen@xxxxxxxxxxxxx> · Mon, 7 Nov 2011 15:34:14 -0800

On Mon, Nov 7, 2011 at 15:10, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> It applies to RBD too... _if_ the ceph-osd process is calling sync(2).
> On btrfs it doesn't, and on XFS/extN/etc., it only does on older kernels
> with older glibc.  New kernels (.39+) and new glibc have syncfs(2), which
> syncs only the fs the ceph-osd is serving up.
>
> http://linux.die.net/man/2/syncfs

I'm willing to believe syncfs(2) makes the deadlock rarer, but
isn't it still possible?

For example, see slide 23 of www.scs.stanford.edu/nyu/02fa/notes/l3.pdf
for a description of this in the context of loopback NFS. If rbd.ko does
write caching, and uses the buffer cache for it, the same cycle applies:

1. ceph-osd reads a file from local disk
2. the fs needs to allocate a buffer for that read
3. the vm chooses a dirty buffer in the rbd cache to flush
4. rbd.ko blocks waiting for ceph-osd, which is still waiting for the
   read from step 1 to complete
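
To make the cycle in the four steps above explicit, here is a minimal
sketch that models them as a wait-for graph and looks for a loop. It is
only an illustration of the argument, not of any actual kernel or Ceph
code: the node names ("local-fs", "vm") and the helper find_cycle are
hypothetical, and the edges simply restate the assumption that rbd.ko
does write caching through the buffer cache.

```python
# Hypothetical wait-for graph: an edge A -> B means
# "A cannot make progress until B does".
WAITS_FOR = {
    "ceph-osd": ["local-fs"],  # step 1: ceph-osd reads a file from local disk
    "local-fs": ["vm"],        # step 2: the fs needs a buffer for the read
    "vm": ["rbd.ko"],          # step 3: vm flushes a dirty rbd buffer to free one
    "rbd.ko": ["ceph-osd"],    # step 4: the flush needs ceph-osd to accept the write
}


def find_cycle(graph, start):
    """Follow wait-for edges depth-first from start.

    Returns the first cycle found as a list of nodes (first node repeated
    at the end), or None if no cycle is reachable from start.
    """
    path = []        # nodes on the current DFS path, in order
    on_path = set()  # same nodes, for O(1) membership checks

    def visit(node):
        if node in on_path:
            # Reached a node we are already waiting on: that is a cycle.
            return path[path.index(node):] + [node]
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    return visit(start)


cycle = find_cycle(WAITS_FOR, "ceph-osd")
print(" -> ".join(cycle) if cycle else "no cycle")
```

Note that nothing in this graph depends on sync(2) versus syncfs(2);
the loop comes from memory pressure, which is why syncfs(2) alone would
not remove it under the assumption above.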