On 1/16/25 11:31 AM, Olga Kornievskaia wrote:
On Thu, Jan 16, 2025 at 11:00 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
On 1/16/25 10:42 AM, Jeff Layton wrote:
On Thu, 2025-01-16 at 10:33 -0500, Chuck Lever wrote:
On 1/16/25 9:54 AM, Olga Kornievskaia wrote:
On Thu, Jan 16, 2025 at 9:27 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
On 1/15/25 6:24 PM, Olga Kornievskaia wrote:
nfsd stores its network transports in a lwq (which is a lockless list)
llist has no ability to remove a particular entry which nfsd needs
to remove a listener thread.
Note that scripts/get_maintainer.pl says that the maintainer of this
file is:
linux-kernel@xxxxxxxxxxxxxxx
so that address should appear on the cc: list of this series. So
should the list of reviewers for NFSD that appear in MAINTAINERS.
I will resend and include the mentioned list.
ISTR Neil found the same gap in the llist API. I don't think it's
possible to safely remove an item from the middle of an llist. Much
of the complexity of the current svc thread scheduler is due to this
complication.
I think you will need to mark the listener as dead and find some
way of safely dealing with it later.
A listener can only be removed if there are no active threads. This
code in nfsd_nl_listener_set_doit() won't allow it which happens
before we are manipulating the listener:
/* For now, no removing old sockets while server is running */
if (serv->sv_nrthreads && !list_empty(&permsocks)) {
list_splice_init(&permsocks, &serv->sv_permsocks);
spin_unlock_bh(&serv->sv_lock);
err = -EBUSY;
goto out_unlock_mtx;
}
Got it.
I'm splitting hairs, but this seems like a special case that might be
true only for NFSD and could be abused by other API consumers.
At the least, the opening block comment in llist.h should be updated
to include del_entry in the locking table.
I would be more comfortable with a solution that works in alignment with
the current llist API, but if others are fine with this solution, then I
won't object strenuously.
(And to be clear, I agree that there is a bug in set_doit() that needs
to be addressed quickly -- it's the specific fix that I'm queasy about).
Yeah, it would be good to address it quickly since you can crash the
box with this right now.
I'm asking myself why this isn't a problem during server shutdown or
when removing just one listener -- with rpc.nfsd. Have we never done
this before we had netlink?
Removing a single listener was never been an option before. Shutdown
removes listeners and then frees the net, serv. proc fs only allowed
to view listeners. To change them, you had to change nfs.conf and
restart the server.
The problem here is new code that didn't handle a single entry removal
correctly.
I'm not thrilled with adding the llist_del_entry() footgun either, but
it should work.
I would like to see one or two alternatives before sticking with this
llist_del_entry() idea.
Another approach we could consider instead would be to delay queueing
all of these sockets to the lwq until after the sv_permsocks list is
settled. You could even just queue up the whole sv_permsocks list at
the end of this. If there's no work there, then there's no real harm.
That is a bit more surgery, however, since you'd have to lift
svc_xprt_received() handling out of svc_xprt_create_from_sa().
Handling the list once instead of adding and/or removing one at a time
seems like a better plan to me (IIUC).
I'll attempt an alternative that creates anew.
Thanks! I appreciate your work on this.
I don't understand this
suggestion to "wait for sv_permsock to be settled". How can you know
when nfsdctl is done adding/removing listeners?
Also, nit: the use of the term "sockets" throughout this code is
confusing. We're dealing with RDMA endpoints as well in here. We can't
easily rename the structure fields, true, but the comments could be more
helpful.
Suggested-by: Jeff Layton <jlayton@xxxxxxxxxx>
Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx>
---
include/linux/llist.h | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/include/linux/llist.h b/include/linux/llist.h
index 2c982ff7475a..fe6be21897d9 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -253,6 +253,42 @@ static inline bool __llist_add(struct llist_node *new, struct llist_head *head)
return __llist_add_batch(new, new, head);
}
+/**
+ * llist_del_entry - remove a particular entry from a lock-less list
+ * @head: head of the list to remove the entry from
+ * @entry: entry to be removed from the list
+ *
+ * Walk the list, find the given entry and remove it from the list.
The above sentence repeats the comments in the code and the code itself.
It visually obscures the next, much more important, sentence.
+ * The caller must ensure that nothing can race in and change the
+ * list while this is running.
+ *
+ * Returns true if the entry was found and removed.
+ */
+static inline bool llist_del_entry(struct llist_head *head, struct llist_node *entry)
+{
+ struct llist_node *pos;
+
+ if (!head->first)
+ return false;
if (llist_empty()) ?
+
+ /* Is it the first entry? */
+ if (head->first == entry) {
+ head->first = entry->next;
+ entry->next = entry;
+ return true;
llist_del_first() or even llist_del_first_this()
Basically I would avoid open-coding this logic and use the existing
helpers where possible. The new code doesn't provide memory release
semantics that would ensure the next access of the llist sees these
updates.
+ }
+
+ /* Find it in the list */
+ llist_for_each(head->first, pos) {
+ if (pos->next == entry) {
+ pos->next = entry->next;
+ entry->next = entry;
+ return true;
+ }
+ }
+ return false;
+}
+
/**
* llist_del_all - delete all entries from lock-less list
* @head: the head of lock-less list to delete all entries
--
Chuck Lever
--
Chuck Lever
--
Chuck Lever