Re: rbd: I/O Errors in low memory situations

On Fri, Feb 20, 2015 at 2:21 AM, Mike Christie <mchristi@xxxxxxxxxx> wrote:
> On 02/18/2015 06:05 PM, "Sebastian Köhler [Alfahosting GmbH]" wrote:
>> Hi,
>>
>> yesterday we had the problem that one of our cluster clients
>> remounted an rbd device in read-only mode. We found this[1] stack trace
>> in the logs. We investigated further and found similar traces on all
>> other machines that are using the rbd kernel module. It seems that
>> these I/O errors occur whenever a client comes under memory pressure
>> and starts swapping. Is there anything we can do, or is this something
>> that needs to be fixed in the code?
>
> Hi,
>
> I was looking at that code the other day and was thinking rbd.c might
> need some changes.
>
> 1. We cannot use GFP_KERNEL in the main IO path (requests sent down
> rbd_request_fn and related helper IO), because the allocation can
> enter direct reclaim, and reclaim can issue writeback that comes
> right back down rbd_request_fn.
> 2. We should use GFP_NOIO instead of GFP_ATOMIC wherever we have
> process context and are not holding a spin lock.

Hi Mike,

Yeah, I have a similar half-baked patch in one of my dev branches and
there is an old ticket for this - http://tracker.ceph.com/issues/4233.
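For illustration, a minimal sketch of point 1 (the helper name here is
hypothetical, not rbd's actual allocation site): a GFP_KERNEL
allocation made while servicing a block request may enter direct
reclaim, and reclaim can issue writeback to the very device being
serviced, so GFP_NOIO is needed to forbid the allocator from starting
any IO.

#include <linux/gfp.h>
#include <linux/slab.h>

static void *rbd_alloc_for_request(size_t len)
{
	/*
	 * Not GFP_KERNEL: direct reclaim could issue writeback that
	 * re-enters rbd_request_fn() and deadlocks.  Not GFP_ATOMIC
	 * either unless a spin lock is held; GFP_NOIO may sleep,
	 * which is fine in process context, and still avoids IO.
	 */
	return kmalloc(len, GFP_NOIO);
}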

> 3. We should use a mempool or preallocate enough memory, so we can
> make forward progress on at least one IO at a time.

There is a mempool for osd requests already, but rbd doesn't use it.
AFAIR it's used by cephfs for aops->writepage and aops->writepages
callbacks only.
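A rough sketch of the mempool approach from point 3 (all names here
are hypothetical; the existing pool Ilya mentions lives in the shared
ceph code under net/ceph, not in rbd.c):

#include <linux/errno.h>
#include <linux/mempool.h>
#include <linux/slab.h>

#define REQ_POOL_MIN	1	/* reserve enough for one in-flight IO */

static struct kmem_cache *req_cache;
static mempool_t *req_pool;

static int req_pool_init(size_t req_size)
{
	req_cache = kmem_cache_create("rbd_req", req_size, 0, 0, NULL);
	if (!req_cache)
		return -ENOMEM;

	/*
	 * The pool keeps REQ_POOL_MIN preallocated objects in reserve,
	 * so allocation can always make forward progress on at least
	 * one IO even when the slab allocator fails under pressure.
	 */
	req_pool = mempool_create_slab_pool(REQ_POOL_MIN, req_cache);
	if (!req_pool) {
		kmem_cache_destroy(req_cache);
		return -ENOMEM;
	}
	return 0;
}

static void *req_alloc(void)
{
	/*
	 * GFP_NOIO: may sleep until a reserved object is returned to
	 * the pool, but never triggers IO and never fails outright.
	 */
	return mempool_alloc(req_pool, GFP_NOIO);
}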

>
> I started to make the attached patch (attached version is built on
> top of Linus's tree as of today). I think it can be refined further
> so that we pass a gfp_t into some functions, because in some cases we
> could use GFP_KERNEL and/or would not need the mempool. For example,
> I think we could use GFP_KERNEL and skip the mempool in the
> rbd_obj_watch_request_helper code paths.

rbd_obj_watch_request_helper() is supposed to be called only during
map/unmap (read rbd device setup/teardown), so we are probably OK
there.
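To make that refinement concrete, a hypothetical sketch of threading a
gfp_t through an allocation helper, so each call site picks the
strongest flags its context allows:

#include <linux/gfp.h>
#include <linux/slab.h>

struct obj_req { int placeholder; };	/* stand-in for rbd's struct */

static struct obj_req *obj_req_alloc(gfp_t gfp)
{
	return kzalloc(sizeof(struct obj_req), gfp);
}

static struct obj_req *alloc_on_io_path(void)
{
	/* servicing rbd_request_fn(): reclaim must not issue IO */
	return obj_req_alloc(GFP_NOIO);
}

static struct obj_req *alloc_on_map_path(void)
{
	/*
	 * Device setup/teardown, e.g. registering the watch request:
	 * no IO to this device is outstanding yet, so GFP_KERNEL
	 * reclaim is safe.
	 */
	return obj_req_alloc(GFP_KERNEL);
}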

>
> I was not done with evaluating all the paths, so had not yet posted it.
> Patch is not tested.
>
> Hey Ilya, I was not sure about the layering-related code. I thought
> functions like rbd_img_obj_parent_read_full could get called as a
> result of an IO being sent down rbd_request_fn, but was not 100%
> sure. I meant to test it out, but have been busy with other stuff.

Yes, rbd_img_obj_parent_read_full() and others are called to serve IO,
effectively from rbd_request_fn().

I guess I'll assign that ticket to myself and work on it next.

Thanks,

                Ilya




