On Fri, Feb 20, 2015 at 2:21 AM, Mike Christie <mchristi@xxxxxxxxxx> wrote:
> On 02/18/2015 06:05 PM, "Sebastian Köhler [Alfahosting GmbH]" wrote:
>> Hi,
>>
>> yesterday we had the problem that one of our cluster clients
>> remounted a rbd device in read-only mode. We found this[1] stack trace
>> in the logs. We investigated further and found similar traces on all
>> other machines that are using the rbd kernel module. It seems to me that
>> whenever there is a swapping situation on a client those I/O errors occur.
>> Is there anything we can do or is this something that needs to be fixed
>> in the code?
>
> Hi,
>
> I was looking at that code the other day and was thinking rbd.c might
> need some changes.
>
> 1. We cannot use GFP_KERNEL in the main IO path (requests that are sent
> down rbd_request_fn and related helper IO), because the allocation could
> come back on rbd_request_fn.
> 2. We should use GFP_NOIO instead of GFP_ATOMIC if we have the proper
> context and are not holding a spin lock.

Hi Mike,

Yeah, I have a similar half-baked patch in one of my dev branches and
there is an old ticket for this - http://tracker.ceph.com/issues/4233.

> 3. We should be using a mempool or preallocate enough mem, so we can
> make forward progress on at least one IO at a time.

There is a mempool for osd requests already, but rbd doesn't use it.
AFAIR it's used by cephfs for aops->writepage and aops->writepages
callbacks only.

>
> I started to make the attached patch (attached version is built over
> linus's tree today). I think it can be further refined, so we pass in
> the gfp_t to some functions, because I think in some cases we could use
> GFP_KERNEL and/or we do not need to use the mempool. For example, I do
> not think we could use GFP_KERNEL and not use the mempool in the
> rbd_obj_watch_request_helper code paths.

rbd_obj_watch_request_helper() is supposed to be called only during
map/unmap (read: rbd device setup/teardown), so we are probably OK there.
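
To illustrate item 3 above, here is a minimal userspace C sketch of the
mempool idea: keep a small preallocated reserve so that at least one
request can always be allocated, and refill the reserve on free. This is
an analogy only. It is not rbd or kernel code and not part of the
attached patch; all names (req_pool, req_pool_alloc, RESERVE_MIN,
simulate_oom) are hypothetical.

```c
/* Userspace analogy of a mempool; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RESERVE_MIN 1 /* assumed: enough for one in-flight request */

struct req { int id; }; /* stand-in for a request object */

struct req_pool {
    struct req *reserve[RESERVE_MIN]; /* preallocated elements */
    size_t nr_reserved;               /* elements currently in reserve */
    bool simulate_oom;                /* knob: pretend malloc() fails */
};

/* Fill the reserve up front, at a point where failure is acceptable. */
static int req_pool_init(struct req_pool *p)
{
    p->nr_reserved = 0;
    p->simulate_oom = false;
    for (size_t i = 0; i < RESERVE_MIN; i++) {
        struct req *r = malloc(sizeof(*r));
        if (!r) {
            while (p->nr_reserved > 0)
                free(p->reserve[--p->nr_reserved]);
            return -1;
        }
        p->reserve[p->nr_reserved++] = r;
    }
    return 0;
}

/*
 * Try the normal allocator first and fall back to the reserve.
 * Returns NULL only when the reserve is exhausted too; a kernel
 * mempool would instead sleep until an element is freed, provided
 * the gfp mask allows blocking.
 */
static struct req *req_pool_alloc(struct req_pool *p)
{
    struct req *r = p->simulate_oom ? NULL : malloc(sizeof(*r));

    if (!r && p->nr_reserved > 0)
        r = p->reserve[--p->nr_reserved];
    return r;
}

/* Refill the reserve before handing memory back to the allocator. */
static void req_pool_free(struct req_pool *p, struct req *r)
{
    assert(r != NULL);
    if (p->nr_reserved < RESERVE_MIN)
        p->reserve[p->nr_reserved++] = r;
    else
        free(r);
}

/* Release whatever is still held in the reserve. */
static void req_pool_destroy(struct req_pool *p)
{
    while (p->nr_reserved > 0)
        free(p->reserve[--p->nr_reserved]);
}
```

With simulate_oom set, the first req_pool_alloc() still succeeds from the
reserve, and once that element is freed the next allocation succeeds
again. That is the "forward progress on at least one IO at a time"
property.
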
>
> I was not done with evaluating all the paths, so had not yet posted it.
> Patch is not tested.
>
> Hey Ilya, I was not sure about the layered related code. I thought
> functions like rbd_img_obj_parent_read_full could get called as a result
> of an IO getting sent down the rbd_request_fn, but was not 100% sure. I
> meant to test it out, but have been busy with other stuff.

Yes, rbd_img_obj_parent_read_full() and others are called to serve IO,
effectively from rbd_request_fn(). I guess I'll assign that ticket to
myself and work on it next.

Thanks,

                Ilya