Re: [PATCH] ceph: make osd_request_timeout changeable online in debugfs

[Resent because of an SMTP error; please ignore this if you have already received it.]

Hi Ilya,
I think there is no conflict between this patch and -o full-force; they serve
different use cases.

(1) This patch is simple.
When we fix the problem of unmounting a filesystem or unmapping a device while the ceph cluster is unavailable in production, we want the logic to be as simple as possible, so that the chance of introducing a regression is very small, especially when we need to backport
the commits to stable branches.

(2) It avoids changing the logic of user applications.
Compare the work needed in higher-level applications: if we use full-force to solve the problem, we have to change the applications themselves. For example, k8s unmounts the filesystem first and then detaches the disk, and changing that framework
is not easy.

(3) It avoids implementing another "timeout and retry with full-force" mechanism. IIUC, from our discussion about full-force, we should not use full-force from the start; we should first try the normal way and retry with full-force on a timeout. For example, you mentioned that we can retry when we receive a specific signal during systemd shutdown, but in other use cases we would have to implement this timeout-and-retry mechanism ourselves, as sketched below.
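
Roughly, every such caller would end up with a wrapper like this (just a sketch; /dev/rbd0 is a placeholder, and full-force is still only a proposal, so the exact option name may differ):

    # try the normal unmap first, with a deadline
    if ! timeout 30 rbd unmap /dev/rbd0; then
        # only if the normal path hangs, fall back to the
        # proposed full-force unmap
        rbd unmap -o full-force /dev/rbd0
    fi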

And yes, there are cases where this patch is not suitable, for example when the system does not have debugfs mounted.

So I think we can merge this patch upstream while continuing to implement full-force.

What do you think?

Thanx
Dongsheng

On 05/24/2018 11:27 AM, Dongsheng Yang wrote:
The default value of osd_request_timeout is 0, which means never time out.
We can set this value when mapping an rbd device with -o "osd_request_timeout=XX",
but we can't change it online.
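
The mapping-time option looks like this (pool and image names here are placeholders):

    $ rbd map rbd/myimage -o osd_request_timeout=30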

[Question 1]: Why do we need to set osd_request_timeout?
When we reboot a node that has krbd devices mapped,
even with rbdmap.service enabled, shutdown will block
if the ceph cluster is not working.

In particular, suppose we have three controller nodes running as ceph mons,
and at the same time some k8s pods with krbd devices on those nodes.
When we want to shut down all the nodes, we can't shut down the last
controller node, because by that point the ceph cluster is no longer
reachable.

[Question 2]: Why don't we use rbd map -o "osd_request_timeout=XX"?
We don't want osd_request_timeout set for the whole lifecycle of the rbd
device: a networking problem or a cluster recovery could then make requests
time out, which would leave the fs read-only and take the application down.

[Question 3]: How does this patch solve these problems?
With this patch, we can map the rbd device with the default value of
osd_request_timeout, meaning never time out, which avoids the problem
described in Question 2.

At the same time, we can set osd_request_timeout to whatever we need
during system shutdown, for example from rbdmap.service, as sketched below.
Then the host can shut down or reboot normally whether or not the ceph
cluster is healthy, which solves the problem described in Question 1.
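
For example, the shutdown path of rbdmap.service could do something like this (a sketch; the 30s value is illustrative, and the per-client debugfs directory names under /sys/kernel/debug/ceph/ follow the existing <fsid>.client<id> layout):

    # give every in-flight and future OSD request a 30s deadline,
    # so unmapping cannot block shutdown forever
    for client in /sys/kernel/debug/ceph/*.client*; do
        echo 30 > "$client/osd_request_timeout"
    done
    rbdmap unmap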

Signed-off-by: Dongsheng Yang <dongsheng.yang@xxxxxxxxxxxx>
---
  include/linux/ceph/libceph.h |  1 +
  net/ceph/debugfs.c           | 31 +++++++++++++++++++++++++++++++
  2 files changed, 32 insertions(+)

diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
index 9ce689a..8fdea50 100644
--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -133,6 +133,7 @@ struct ceph_client {
  	struct dentry *debugfs_monmap;
  	struct dentry *debugfs_osdmap;
  	struct dentry *debugfs_options;
+	struct dentry *debugfs_osd_req_timeout;
  #endif
  };
diff --git a/net/ceph/debugfs.c b/net/ceph/debugfs.c
index 46f6570..2b6cae3 100644
--- a/net/ceph/debugfs.c
+++ b/net/ceph/debugfs.c
@@ -389,6 +389,28 @@ static int client_options_show(struct seq_file *s, void *p)
  CEPH_DEFINE_SHOW_FUNC(osdc_show)
  CEPH_DEFINE_SHOW_FUNC(client_options_show)
+static int osd_request_timeout_get(void *data, u64 *val)
+{
+	struct ceph_client *client = data;
+
+	*val = client->options->osd_request_timeout;
+	return 0;
+}
+
+static int osd_request_timeout_set(void *data, u64 val)
+{
+	struct ceph_client *client = data;
+
+	client->options->osd_request_timeout = val;
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(client_osd_req_timeout_fops,
+			osd_request_timeout_get,
+			osd_request_timeout_set,
+			"%lld\n");
+
+
  int __init ceph_debugfs_init(void)
  {
  	ceph_debugfs_dir = debugfs_create_dir("ceph", NULL);
@@ -457,6 +479,14 @@ int ceph_debugfs_client_init(struct ceph_client *client)
  	if (!client->debugfs_options)
  		goto out;
+	client->debugfs_osd_req_timeout = debugfs_create_file("osd_request_timeout",
+					  0600,
+					  client->debugfs_dir,
+					  client,
+					  &client_osd_req_timeout_fops);
+	if (!client->debugfs_osd_req_timeout)
+		goto out;
+
  	return 0;
out:
@@ -467,6 +497,7 @@ int ceph_debugfs_client_init(struct ceph_client *client)
  void ceph_debugfs_client_cleanup(struct ceph_client *client)
  {
  	dout("ceph_debugfs_client_cleanup %p\n", client);
+	debugfs_remove(client->debugfs_osd_req_timeout);
  	debugfs_remove(client->debugfs_options);
  	debugfs_remove(client->debugfs_osdmap);
  	debugfs_remove(client->debugfs_monmap);




