Re: [RFC] netlink broadcast return value

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Patrick McHardy wrote:
> Pablo Neira Ayuso wrote:
>> And again, you point that this should be per-socket, but how can you
>> make this option per-socket? The only way that I see to make
>> state-change reporting reliable is to drop the packet to force the peer
>> to retransmit the packet and trigger the same state-change, and that
>> affect all ctnetlink listeners.
> 
> For unicast its obviously simple, for broadcast you'd need something
> like this:
> 
> err = 0;
> for (all netlink sockets; sk && !err; ...) {
>     skb = skb_clone(...)
>     if (skb == NULL) {
>         if (sk->flags & NETLINK_HIGHLY_RELIABLE)
>             err = -ENOBUFS;
>         continue;
>     }
>     ...
> }
> 
> So you're returning an error when at least one of the "reliable"
> sockets doesn't get its delivery.

Patrick, I like it, I'm fine with this approach as soon as it let me add
the "reliable" ctnetlink state-change reporting. I can add the following
on top of the patch that David already applied:

--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
[...]
@@ -999,6 +1000,7 @@ static inline int do_one_broadcast(struct sock *sk,
 		p->skb2 = NULL;
 	} else if ((val = netlink_broadcast_deliver(sk, p->skb2)) < 0) {
 		netlink_overrun(sk);
+		p->delivery_failure = 1;
                ^^^^^^^^^^^^^^^^^^^^^^^^
Replace this by:
+		if (nlk->flags & NETLINK_HIGHLY_RELIABLE)
+			p->delivery_failure = 1;

And include the flag definition and setsockopt() operations in the new
patch, of course.

Please, find the previous patch that was applied to net-next tree
enclosed to save you some time in case that you don't know what patch I
was refering to. I think that the changes (several drivers and such) are
still useful, as they should ignore the return value of
netlink_broadcast() since it's not of any use for them (as we already
discussed, they printk the error, that's useless).

-- 
"Los honestos son inadaptados sociales" -- Les Luthiers
netlink: change return-value logic of netlink_broadcast()

From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>

Currently, netlink_broadcast() reports errors to the caller if no
messages at all were delivered:

1) If, at least, one message has been delivered correctly, returns 0.
2) Otherwise, if no messages at all were delivered due to skb_clone()
   failure, return -ENOBUFS.
3) Otherwise, if there are no listeners, return -ESRCH.

With this patch, the caller knows if the delivery of any of the
messages to the listeners have failed:

1) If it fails to deliver any message (for whatever reason), return
   -ENOBUFS.
2) Otherwise, if all messages were delivered OK, returns 0.
3) Otherwise, if no listeners, return -ESRCH.

In the current ctnetlink code and in Netfilter in general, we can add
reliable logging and connection tracking event delivery by dropping the
packets whose events were not successfully delivered over Netlink. Of
course, this option would be settable via /proc as this approach reduces
performance (in terms of filtered connections per seconds by a stateful
firewall) but providing reliable logging and event delivery (for
conntrackd) in return.

This patch also changes some clients of netlink_broadcast() that
may report ENOBUFS errors via printk. This error handling is not
of any help. Instead, the userspace daemons that are listening to
those netlink messages should resync themselves with the kernel-side
if they hit ENOBUFS.

BTW, netlink_broadcast() clients include those that call
cn_netlink_send(), nlmsg_multicast() and genlmsg_multicast() since they
internally call netlink_broadcast() and return its error value.

Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
---

 drivers/acpi/event.c                |    6 +-----
 drivers/scsi/scsi_transport_fc.c    |   16 ++++------------
 drivers/scsi/scsi_transport_iscsi.c |   12 ++----------
 drivers/video/uvesafb.c             |    5 ++++-
 fs/dquot.c                          |    5 +----
 lib/kobject_uevent.c                |    3 +++
 net/netlink/af_netlink.c            |    8 ++++++--
 net/wimax/op-msg.c                  |    9 +++------
 net/wimax/stack.c                   |   12 ++++--------
 9 files changed, 28 insertions(+), 48 deletions(-)


diff --git a/drivers/acpi/event.c b/drivers/acpi/event.c
index 0c24bd4..aeb7e5f 100644
--- a/drivers/acpi/event.c
+++ b/drivers/acpi/event.c
@@ -235,11 +235,7 @@ int acpi_bus_generate_netlink_event(const char *device_class,
 		return result;
 	}
 
-	result =
-	    genlmsg_multicast(skb, 0, acpi_event_mcgrp.id, GFP_ATOMIC);
-	if (result)
-		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-				  "Failed to send a Genetlink message!\n"));
+	genlmsg_multicast(skb, 0, acpi_event_mcgrp.id, GFP_ATOMIC);
 	return 0;
 }
 
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 5f77417..3ee4eb4 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -533,12 +533,8 @@ fc_host_post_event(struct Scsi_Host *shost, u32 event_number,
 	event->event_code = event_code;
 	event->event_data = event_data;
 
-	err = nlmsg_multicast(scsi_nl_sock, skb, 0, SCSI_NL_GRP_FC_EVENTS,
-			      GFP_KERNEL);
-	if (err && (err != -ESRCH))	/* filter no recipient errors */
-		/* nlmsg_multicast already kfree_skb'd */
-		goto send_fail;
-
+	nlmsg_multicast(scsi_nl_sock, skb, 0, SCSI_NL_GRP_FC_EVENTS,
+			GFP_KERNEL);
 	return;
 
 send_fail_skb:
@@ -607,12 +603,8 @@ fc_host_post_vendor_event(struct Scsi_Host *shost, u32 event_number,
 	event->event_code = FCH_EVT_VENDOR_UNIQUE;
 	memcpy(&event->event_data, data_buf, data_len);
 
-	err = nlmsg_multicast(scsi_nl_sock, skb, 0, SCSI_NL_GRP_FC_EVENTS,
-			      GFP_KERNEL);
-	if (err && (err != -ESRCH))	/* filter no recipient errors */
-		/* nlmsg_multicast already kfree_skb'd */
-		goto send_vendor_fail;
-
+	nlmsg_multicast(scsi_nl_sock, skb, 0, SCSI_NL_GRP_FC_EVENTS,
+			GFP_KERNEL);
 	return;
 
 send_vendor_fail_skb:
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 75c9297..2adfab8 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -966,15 +966,7 @@ iscsi_if_transport_lookup(struct iscsi_transport *tt)
 static int
 iscsi_broadcast_skb(struct sk_buff *skb, gfp_t gfp)
 {
-	int rc;
-
-	rc = netlink_broadcast(nls, skb, 0, 1, gfp);
-	if (rc < 0) {
-		printk(KERN_ERR "iscsi: can not broadcast skb (%d)\n", rc);
-		return rc;
-	}
-
-	return 0;
+	return netlink_broadcast(nls, skb, 0, 1, gfp);
 }
 
 static int
@@ -1207,7 +1199,7 @@ int iscsi_session_event(struct iscsi_cls_session *session,
 	 * the user and when the daemon is restarted it will handle it
 	 */
 	rc = iscsi_broadcast_skb(skb, GFP_KERNEL);
-	if (rc < 0)
+	if (rc == -ESRCH)
 		iscsi_cls_session_printk(KERN_ERR, session,
 					 "Cannot notify userspace of session "
 					 "event %u. Check iscsi daemon\n",
diff --git a/drivers/video/uvesafb.c b/drivers/video/uvesafb.c
index 6c2d37f..74ae758 100644
--- a/drivers/video/uvesafb.c
+++ b/drivers/video/uvesafb.c
@@ -204,8 +204,11 @@ static int uvesafb_exec(struct uvesafb_ktask *task)
 		} else {
 			v86d_started = 1;
 			err = cn_netlink_send(m, 0, gfp_any());
+			if (err == -ENOBUFS)
+				err = 0;
 		}
-	}
+	} else if (err == -ENOBUFS)
+		err = 0;
 
 	if (!err && !(task->t.flags & TF_EXIT))
 		err = !wait_for_completion_timeout(task->done,
diff --git a/fs/dquot.c b/fs/dquot.c
index bca3cac..d6add0b 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -1057,10 +1057,7 @@ static void send_warning(const struct dquot *dquot, const char warntype)
 		goto attr_err_out;
 	genlmsg_end(skb, msg_head);
 
-	ret = genlmsg_multicast(skb, 0, quota_genl_family.id, GFP_NOFS);
-	if (ret < 0 && ret != -ESRCH)
-		printk(KERN_ERR
-			"VFS: Failed to send notification message: %d\n", ret);
+	genlmsg_multicast(skb, 0, quota_genl_family.id, GFP_NOFS);
 	return;
 attr_err_out:
 	printk(KERN_ERR "VFS: Not enough space to compose quota message!\n");
diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index 318328d..3813102 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -227,6 +227,9 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 			NETLINK_CB(skb).dst_group = 1;
 			retval = netlink_broadcast(uevent_sock, skb, 0, 1,
 						   GFP_KERNEL);
+			/* ENOBUFS should be handled in userspace */
+			if (retval == -ENOBUFS)
+				retval = 0;
 		} else
 			retval = -ENOMEM;
 	}
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 9eb895c..6ee69c2 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -950,6 +950,7 @@ struct netlink_broadcast_data {
 	u32 pid;
 	u32 group;
 	int failure;
+	int delivery_failure;
 	int congested;
 	int delivered;
 	gfp_t allocation;
@@ -999,6 +1000,7 @@ static inline int do_one_broadcast(struct sock *sk,
 		p->skb2 = NULL;
 	} else if ((val = netlink_broadcast_deliver(sk, p->skb2)) < 0) {
 		netlink_overrun(sk);
+		p->delivery_failure = 1;
 	} else {
 		p->congested |= val;
 		p->delivered = 1;
@@ -1025,6 +1027,7 @@ int netlink_broadcast(struct sock *ssk, struct sk_buff *skb, u32 pid,
 	info.pid = pid;
 	info.group = group;
 	info.failure = 0;
+	info.delivery_failure = 0;
 	info.congested = 0;
 	info.delivered = 0;
 	info.allocation = allocation;
@@ -1045,13 +1048,14 @@ int netlink_broadcast(struct sock *ssk, struct sk_buff *skb, u32 pid,
 	if (info.skb2)
 		kfree_skb(info.skb2);
 
+	if (info.delivery_failure || info.failure)
+		return -ENOBUFS;
+
 	if (info.delivered) {
 		if (info.congested && (allocation & __GFP_WAIT))
 			yield();
 		return 0;
 	}
-	if (info.failure)
-		return -ENOBUFS;
 	return -ESRCH;
 }
 EXPORT_SYMBOL(netlink_broadcast);
diff --git a/net/wimax/op-msg.c b/net/wimax/op-msg.c
index cb3b4ad..5d149c1 100644
--- a/net/wimax/op-msg.c
+++ b/net/wimax/op-msg.c
@@ -258,7 +258,6 @@ EXPORT_SYMBOL_GPL(wimax_msg_len);
  */
 int wimax_msg_send(struct wimax_dev *wimax_dev, struct sk_buff *skb)
 {
-	int result;
 	struct device *dev = wimax_dev->net_dev->dev.parent;
 	void *msg = skb->data;
 	size_t size = skb->len;
@@ -266,11 +265,9 @@ int wimax_msg_send(struct wimax_dev *wimax_dev, struct sk_buff *skb)
 
 	d_printf(1, dev, "CTX: wimax msg, %zu bytes\n", size);
 	d_dump(2, dev, msg, size);
-	result = genlmsg_multicast(skb, 0, wimax_gnl_mcg.id, GFP_KERNEL);
-	d_printf(1, dev, "CTX: genl multicast result %d\n", result);
-	if (result == -ESRCH)	/* Nobody connected, ignore it */
-		result = 0;	/* btw, the skb is freed already */
-	return result;
+	genlmsg_multicast(skb, 0, wimax_gnl_mcg.id, GFP_KERNEL);
+	d_printf(1, dev, "CTX: genl multicast done\n");
+	return 0;
 }
 EXPORT_SYMBOL_GPL(wimax_msg_send);
 
diff --git a/net/wimax/stack.c b/net/wimax/stack.c
index 3869c03..a0ee76b 100644
--- a/net/wimax/stack.c
+++ b/net/wimax/stack.c
@@ -163,16 +163,12 @@ int wimax_gnl_re_state_change_send(
 	struct device *dev = wimax_dev_to_dev(wimax_dev);
 	d_fnstart(3, dev, "(wimax_dev %p report_skb %p)\n",
 		  wimax_dev, report_skb);
-	if (report_skb == NULL)
+	if (report_skb == NULL) {
+		result = -ENOMEM;
 		goto out;
-	genlmsg_end(report_skb, header);
-	result = genlmsg_multicast(report_skb, 0, wimax_gnl_mcg.id, GFP_KERNEL);
-	if (result == -ESRCH)	/* Nobody connected, ignore it */
-		result = 0;	/* btw, the skb is freed already */
-	if (result < 0) {
-		dev_err(dev, "RE_STCH: Error sending: %d\n", result);
-		nlmsg_free(report_skb);
 	}
+	genlmsg_end(report_skb, header);
+	genlmsg_multicast(report_skb, 0, wimax_gnl_mcg.id, GFP_KERNEL);
 out:
 	d_fnend(3, dev, "(wimax_dev %p report_skb %p) = %d\n",
 		wimax_dev, report_skb, result);

[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux