Re: [PATCH] libceph: Complete stuck requests to OSD with EIO

Hi Jeff,

On Fri, 2017-02-10 at 14:31, Jeff Layton wrote:
On Thu, 2017-02-09 at 16:04 +0300, Artur Molchanov wrote:
From: Artur Molchanov <artur.molchanov@xxxxxxxxxx>

Complete stuck requests to OSD with error EIO after osd_request_timeout has expired.
If osd_request_timeout equals 0 (the default value) then do nothing with
hung requests (keep the default behavior).

Create the RBD map option osd_request_timeout to set the timeout in seconds. Set
osd_request_timeout to 0 by default.
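To illustrate the option described above, here is a userspace sketch of parsing an "osd_request_timeout=&lt;seconds&gt;" map option with a default of 0 (timeout disabled). The function name is illustrative; the real libceph option parsing uses the kernel's match_table_t machinery, not strtoul.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative sketch only: parse an "osd_request_timeout=<seconds>"
 * map option, falling back to the default of 0 (timeout disabled)
 * for anything that does not match.
 */
static unsigned int parse_osd_request_timeout(const char *opt)
{
	const char *prefix = "osd_request_timeout=";

	if (strncmp(opt, prefix, strlen(prefix)) != 0)
		return 0;	/* unknown option: keep default behavior */
	return (unsigned int)strtoul(opt + strlen(prefix), NULL, 10);
}
```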

Also, what exactly are the requests blocked on when this occurs? Is the
ceph_osd_request_target ending up paused? I wonder if we might be better
off with something that returns a hard error under the circumstances
where you're hanging, rather than depending on timeouts.

I think it is better to complete requests only after the timeout has expired, because a request can fail due to temporary network issues (e.g. a router restart) or a machine/service being restarted.

Having a job that has to wake up every second or so isn't ideal. Perhaps
you would be better off scheduling the delayed work in the request
submission codepath, and only rearm it when the tree isn't empty after
calling complete_osd_stuck_requests?

Would you please tell me more about rearming the work only if the tree is not empty after calling complete_osd_stuck_requests? From which code should we call complete_osd_stuck_requests?

As I understand it, there are two primary cases:
1 - Requests to an OSD failed, but the monitors do not return a new osdmap (because all monitors are offline, or the monitors have not updated the osdmap yet). In this case requests are retried by cyclically calling ceph_con_workfn -> con_fault -> ceph_con_workfn. We can check the request timestamp and, instead of calling con_fault, complete the request.
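The timestamp check for case 1 could be sketched as a small predicate: a request counts as stuck only when osd_request_timeout is non-zero and the request's start stamp is older than the timeout. This is a hedged userspace model with invented names; the actual kernel code would work with jiffies and struct ceph_osd_request rather than time_t.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper modelling the check described above.  A request
 * is considered stuck only when osd_request_timeout is non-zero and
 * the request has been outstanding for at least that many seconds.
 */
static bool request_is_stuck(time_t stamp, time_t now,
			     unsigned int osd_request_timeout)
{
	if (osd_request_timeout == 0)	/* 0 keeps the default behavior */
		return false;
	return now - stamp >= (time_t)osd_request_timeout;
}
```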

2 - The monitors return a new osdmap which does not have any OSD for the RBD image.
In this case all requests to the last ready OSD will be linked to the "homeless" OSD and will not be retried until a new osdmap with an appropriate OSD is received. I think we need an additional periodic check of the timestamps of such requests.

Yes, there is already an existing job, handle_timeout. But the responsibility of that job is sending keepalive requests to slow OSDs. I'm not sure it is a good idea to perform additional actions inside it.
I decided that creating a specific job, handle_osd_request_timeout, is more applicable.
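The proposed job, combined with Jeff's rearming suggestion, could look roughly like the sketch below: scan the outstanding requests, complete every timed-out one with -EIO, and report whether anything is still pending so the caller can decide whether to rearm the delayed work. All names here are assumptions for illustration; the kernel version would walk the osdc request trees under the proper locks.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Toy stand-in for a pending OSD request (illustrative only). */
struct fake_req {
	time_t stamp;		/* submission time, in seconds */
	int result;		/* 0 while still pending */
	bool completed;
};

/*
 * Model of the periodic job discussed above: complete each stuck
 * request with -EIO and return true when live requests remain, i.e.
 * when the delayed work should be rearmed.
 */
static bool handle_osd_request_timeout(struct fake_req *reqs, size_t n,
				       time_t now, unsigned int timeout)
{
	bool pending = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (reqs[i].completed)
			continue;
		if (timeout && now - reqs[i].stamp >= (time_t)timeout) {
			reqs[i].result = -EIO;	/* complete stuck request */
			reqs[i].completed = true;
		} else {
			pending = true;		/* a live request remains */
		}
	}
	return pending;		/* rearm only if something is still pending */
}
```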

This job will run only once with the default value of osd_request_timeout (0).
At the same time, I think users will not use too small a value for this parameter. I expect a typical value will be about 1 minute or greater.

Also, I don't see where this job is ever cancelled when the osdc is torn
down. That needs to occur or you'll cause a use-after-free oops...

It is my fault; thanks for the correction.

--
Artur
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html