resending to list ---------- Forwarded message --------- From: Steve French <smfrench@xxxxxxxxx> Date: Mon, Nov 6, 2023 at 1:13 PM Subject: Re: [PATCH 09/14] cifs: add a back pointer to cifs_sb from tcon To: Shyam Prasad N <nspmangalore@xxxxxxxxx> Cc: Paulo Alcantara <pc@xxxxxxxxxxxxx>, <bharathsm.hsk@xxxxxxxxx>, <linux-cifs@xxxxxxxxxxxxxxx>, Shyam Prasad N <sprasad@xxxxxxxxxxxxx> Updated patches again to address compile/sparse warning and pushed to for-next pending more testing and review Attached are updated patches 1 and 2 On Mon, Nov 6, 2023 at 12:45 PM Steve French <smfrench@xxxxxxxxx> wrote: > > lightly updated patch 1 and patch 2 in your series to fix checkpatch warnings > > Was thinking about patch 2 though and whether it could hurt cases where there is little parallelism - we could randomly pick channels to send things on so we could overuse channels with higher latency rather than preferring the first two channels which are the "faster" ones (and presumably lower latencies). Was wondering if we end up picking e.g. 4 channels, 1 of which is "fast" and 3 are slower - we could end up with cases where no traffic on channel 1 but we end up sending on a longer latency channlel - ie the round robin approach could end up using channel 2 or 3 or 4 which have longer latencies even if no traffic on the "fastest" channel ... > > > On Mon, Nov 6, 2023 at 11:04 AM Shyam Prasad N <nspmangalore@xxxxxxxxx> wrote: >> >> On Mon, Nov 6, 2023 at 9:42 PM Shyam Prasad N <nspmangalore@xxxxxxxxx> wrote: >> > >> > On Sat, Nov 4, 2023 at 2:33 AM Paulo Alcantara <pc@xxxxxxxxxxxxx> wrote: >> > > >> > > nspmangalore@xxxxxxxxx writes: >> > > >> > > > From: Shyam Prasad N <sprasad@xxxxxxxxxxxxx> >> > > > >> > > > Today, we have no way to access the cifs_sb when we >> > > > just have pointers to struct tcon. This is very >> > > > limiting as many functions deal with cifs_sb, and >> > > > these calls do not directly originate from VFS. >> > > > >> > > > This change introduces a new cifs_sb field in cifs_tcon >> > > > that points to the cifs_sb for the tcon. The assumption >> > > > here is that a tcon will always map to this cifs_sb and >> > > > will never change. >> > > > >> > > > Also, refcounting should not be necessary, since cifs_sb >> > > > will never be freed before tcon. >> > > > >> > > > Signed-off-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx> >> > > > --- >> > > > fs/smb/client/cifsglob.h | 1 + >> > > > fs/smb/client/connect.c | 2 ++ >> > > > 2 files changed, 3 insertions(+) >> > > >> > > This is wrong as a single tcon may be shared among different >> > > superblocks. You can, however, map those superblocks to a tcon by using >> > > the cifs_sb_master_tcon() helper. >> > > >> > > If you do something like this >> > > >> > > mount.cifs //srv/share /mnt/1 -o ... >> > > mount.cifs //srv/share /mnt/1 -o ... -> -EBUSY >> > > >> > > tcon->cifs_sb will end up with the already freed superblock pointer that >> > > was compared to the existing one. So, you'll get an use-after-free when >> > > you dereference tcon->cifs_sb as in patch 11/14. >> > >> > Hi Steve, >> > I discussed this one with Paulo separately. You can drop this patch. >> > I'll have another patch in place of this one. And then send you an >> > updated patch for the patch which depends on it. >> > >> > -- >> > Regards, >> > Shyam >> >> Here's the replacement patch for "cifs: add a back pointer to cifs_sb from tcon" >> >> -- >> Regards, >> Shyam > > > > -- > Thanks, > > Steve -- Thanks, Steve -- Thanks, Steve
From b10f9cd83e93d42f44b455ed567af0941207b80a Mon Sep 17 00:00:00 2001 From: Shyam Prasad N <sprasad@microsoft.com> Date: Mon, 26 Dec 2022 11:24:56 +0000 Subject: [PATCH 2/7] cifs: distribute channels across interfaces based on speed Today, if the server interfaces RSS capable, we simply choose the fastest interface to setup a channel. This is not a scalable approach, and does not make a lot of attempt to distribute the connections. This change does a weighted distribution of channels across all the available server interfaces, where the weight is a function of the advertised interface speed. Also make sure that we don't mix rdma and non-rdma for channels. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> --- fs/smb/client/cifs_debug.c | 16 ++++++++ fs/smb/client/cifsglob.h | 2 + fs/smb/client/sess.c | 84 +++++++++++++++++++++++++++++++------- 3 files changed, 88 insertions(+), 14 deletions(-) diff --git a/fs/smb/client/cifs_debug.c b/fs/smb/client/cifs_debug.c index 4a1b9bf3a574..132ba277438f 100644 --- a/fs/smb/client/cifs_debug.c +++ b/fs/smb/client/cifs_debug.c @@ -284,6 +284,8 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v) struct cifs_ses *ses; struct cifs_tcon *tcon; struct cifs_server_iface *iface; + size_t iface_weight = 0, iface_min_speed = 0; + struct cifs_server_iface *last_iface = NULL; int c, i, j; seq_puts(m, @@ -549,11 +551,25 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v) "\tLast updated: %lu seconds ago", ses->iface_count, (jiffies - ses->iface_last_update) / HZ); + + last_iface = list_last_entry(&ses->iface_list, + struct cifs_server_iface, + iface_head); + iface_min_speed = last_iface->speed; + j = 0; list_for_each_entry(iface, &ses->iface_list, iface_head) { seq_printf(m, "\n\t%d)", ++j); cifs_dump_iface(m, iface); + + iface_weight = iface->speed / iface_min_speed; + seq_printf(m, "\t\tWeight (cur,total): (%zu,%zu)" + "\n\t\tAllocated channels: %u\n", + iface->weight_fulfilled, + iface_weight, + iface->num_channels); + if (is_ses_using_iface(ses, iface)) seq_puts(m, "\t\t[CONNECTED]\n"); } diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h index 552ed441281a..81e7a45f413d 100644 --- a/fs/smb/client/cifsglob.h +++ b/fs/smb/client/cifsglob.h @@ -969,6 +969,8 @@ struct cifs_server_iface { struct list_head iface_head; struct kref refcount; size_t speed; + size_t weight_fulfilled; + unsigned int num_channels; unsigned int rdma_capable : 1; unsigned int rss_capable : 1; unsigned int is_active : 1; /* unset if non existent */ diff --git a/fs/smb/client/sess.c b/fs/smb/client/sess.c index 5d65243b0178..04d300e9a779 100644 --- a/fs/smb/client/sess.c +++ b/fs/smb/client/sess.c @@ -179,7 +179,9 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) int left; int rc = 0; int tries = 0; + size_t iface_weight = 0, iface_min_speed = 0; struct cifs_server_iface *iface = NULL, *niface = NULL; + struct cifs_server_iface *last_iface = NULL; spin_lock(&ses->chan_lock); @@ -207,21 +209,11 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) } spin_unlock(&ses->chan_lock); - /* - * Keep connecting to same, fastest, iface for all channels as - * long as its RSS. Try next fastest one if not RSS or channel - * creation fails. - */ - spin_lock(&ses->iface_lock); - iface = list_first_entry(&ses->iface_list, struct cifs_server_iface, - iface_head); - spin_unlock(&ses->iface_lock); - while (left > 0) { tries++; if (tries > 3*ses->chan_max) { - cifs_dbg(FYI, "too many channel open attempts (%d channels left to open)\n", + cifs_dbg(VFS, "too many channel open attempts (%d channels left to open)\n", left); break; } @@ -229,17 +221,35 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) spin_lock(&ses->iface_lock); if (!ses->iface_count) { spin_unlock(&ses->iface_lock); + cifs_dbg(VFS, "server %s does not advertise interfaces\n", + ses->server->hostname); break; } + if (!iface) + iface = list_first_entry(&ses->iface_list, struct cifs_server_iface, + iface_head); + last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface, + iface_head); + iface_min_speed = last_iface->speed; + list_for_each_entry_safe_from(iface, niface, &ses->iface_list, iface_head) { + /* do not mix rdma and non-rdma interfaces */ + if (iface->rdma_capable != ses->server->rdma) + continue; + /* skip ifaces that are unusable */ if (!iface->is_active || (is_ses_using_iface(ses, iface) && - !iface->rss_capable)) { + !iface->rss_capable)) + continue; + + /* check if we already allocated enough channels */ + iface_weight = iface->speed / iface_min_speed; + + if (iface->weight_fulfilled >= iface_weight) continue; - } /* take ref before unlock */ kref_get(&iface->refcount); @@ -256,10 +266,21 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) continue; } - cifs_dbg(FYI, "successfully opened new channel on iface:%pIS\n", + iface->num_channels++; + iface->weight_fulfilled++; + cifs_dbg(VFS, "successfully opened new channel on iface:%pIS\n", &iface->sockaddr); break; } + + /* reached end of list. reset weight_fulfilled and start over */ + if (list_entry_is_head(iface, &ses->iface_list, iface_head)) { + list_for_each_entry(iface, &ses->iface_list, iface_head) + iface->weight_fulfilled = 0; + spin_unlock(&ses->iface_lock); + iface = NULL; + continue; + } spin_unlock(&ses->iface_lock); left--; @@ -278,8 +299,10 @@ int cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index; + size_t iface_weight = 0, iface_min_speed = 0; struct cifs_server_iface *iface = NULL; struct cifs_server_iface *old_iface = NULL; + struct cifs_server_iface *last_iface = NULL; int rc = 0; spin_lock(&ses->chan_lock); @@ -299,13 +322,34 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) spin_unlock(&ses->chan_lock); spin_lock(&ses->iface_lock); + if (!ses->iface_count) { + spin_unlock(&ses->iface_lock); + cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname); + return 0; + } + + last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface, + iface_head); + iface_min_speed = last_iface->speed; + /* then look for a new one */ list_for_each_entry(iface, &ses->iface_list, iface_head) { + /* do not mix rdma and non-rdma interfaces */ + if (iface->rdma_capable != server->rdma) + continue; + if (!iface->is_active || (is_ses_using_iface(ses, iface) && !iface->rss_capable)) { continue; } + + /* check if we already allocated enough channels */ + iface_weight = iface->speed / iface_min_speed; + + if (iface->weight_fulfilled >= iface_weight) + continue; + kref_get(&iface->refcount); break; } @@ -321,10 +365,22 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) cifs_dbg(FYI, "replacing iface: %pIS with %pIS\n", &old_iface->sockaddr, &iface->sockaddr); + + old_iface->num_channels--; + if (old_iface->weight_fulfilled) + old_iface->weight_fulfilled--; + iface->num_channels++; + iface->weight_fulfilled++; + kref_put(&old_iface->refcount, release_iface); } else if (old_iface) { cifs_dbg(FYI, "releasing ref to iface: %pIS\n", &old_iface->sockaddr); + + old_iface->num_channels--; + if (old_iface->weight_fulfilled) + old_iface->weight_fulfilled--; + kref_put(&old_iface->refcount, release_iface); } else { WARN_ON(!iface); -- 2.39.2
From 959c5b00670d2ddef91748f2deba215f965546a4 Mon Sep 17 00:00:00 2001 From: Shyam Prasad N <sprasad@microsoft.com> Date: Fri, 13 Oct 2023 09:25:30 +0000 Subject: [PATCH 1/7] cifs: handle cases where a channel is closed So far, SMB multichannel could only scale up, but not scale down the number of channels. In this series of patch, we now allow the client to deal with the case of multichannel disabled on the server when the share is mounted. With that change, we now need the ability to scale down the channels. This change allows the client to deal with cases of missing channels more gracefully. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> --- fs/smb/client/cifs_debug.c | 5 +++++ fs/smb/client/cifsglob.h | 1 + fs/smb/client/cifsproto.h | 2 +- fs/smb/client/connect.c | 6 +++++- fs/smb/client/sess.c | 28 ++++++++++++++++++++++++---- fs/smb/client/smb2transport.c | 8 +++++++- 6 files changed, 43 insertions(+), 7 deletions(-) diff --git a/fs/smb/client/cifs_debug.c b/fs/smb/client/cifs_debug.c index 6d8804fb6535..4a1b9bf3a574 100644 --- a/fs/smb/client/cifs_debug.c +++ b/fs/smb/client/cifs_debug.c @@ -136,6 +136,11 @@ cifs_dump_channel(struct seq_file *m, int i, struct cifs_chan *chan) { struct TCP_Server_Info *server = chan->server; + if (!server) { + seq_printf(m, "\n\n\t\tChannel: %d DISABLED", i+1); + return; + } + seq_printf(m, "\n\n\t\tChannel: %d ConnectionId: 0x%llx" "\n\t\tNumber of credits: %d,%d,%d Dialect 0x%x" "\n\t\tTCP status: %d Instance: %d" diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h index 02082621d8e0..552ed441281a 100644 --- a/fs/smb/client/cifsglob.h +++ b/fs/smb/client/cifsglob.h @@ -1050,6 +1050,7 @@ struct cifs_ses { spinlock_t chan_lock; /* ========= begin: protected by chan_lock ======== */ #define CIFS_MAX_CHANNELS 16 +#define CIFS_INVAL_CHAN_INDEX (-1) #define CIFS_ALL_CHANNELS_SET(ses) \ ((1UL << (ses)->chan_count) - 1) #define CIFS_ALL_CHANS_GOOD(ses) \ diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h index 890ceddae07e..c1f71c6be7e3 100644 --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -616,7 +616,7 @@ bool is_server_using_iface(struct TCP_Server_Info *server, bool is_ses_using_iface(struct cifs_ses *ses, struct cifs_server_iface *iface); void cifs_ses_mark_for_reconnect(struct cifs_ses *ses); -unsigned int +int cifs_ses_get_chan_index(struct cifs_ses *ses, struct TCP_Server_Info *server); void diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c index 1a137b33858a..3ff82f0aa00e 100644 --- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -173,8 +173,12 @@ cifs_signal_cifsd_for_reconnect(struct TCP_Server_Info *server, list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) { spin_lock(&ses->chan_lock); for (i = 0; i < ses->chan_count; i++) { + if (!ses->chans[i].server) + continue; + spin_lock(&ses->chans[i].server->srv_lock); - ses->chans[i].server->tcpStatus = CifsNeedReconnect; + if (ses->chans[i].server->tcpStatus != CifsExiting) + ses->chans[i].server->tcpStatus = CifsNeedReconnect; spin_unlock(&ses->chans[i].server->srv_lock); } spin_unlock(&ses->chan_lock); diff --git a/fs/smb/client/sess.c b/fs/smb/client/sess.c index cd474cf98f30..5d65243b0178 100644 --- a/fs/smb/client/sess.c +++ b/fs/smb/client/sess.c @@ -69,7 +69,7 @@ bool is_ses_using_iface(struct cifs_ses *ses, struct cifs_server_iface *iface) /* channel helper functions. assumed that chan_lock is held by caller. */ -unsigned int +int cifs_ses_get_chan_index(struct cifs_ses *ses, struct TCP_Server_Info *server) { @@ -85,14 +85,17 @@ cifs_ses_get_chan_index(struct cifs_ses *ses, cifs_dbg(VFS, "unable to get chan index for server: 0x%llx", server->conn_id); WARN_ON(1); - return 0; + return CIFS_INVAL_CHAN_INDEX; } void cifs_chan_set_in_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server) { - unsigned int chan_index = cifs_ses_get_chan_index(ses, server); + int chan_index = cifs_ses_get_chan_index(ses, server); + + if (chan_index == CIFS_INVAL_CHAN_INDEX) + return; ses->chans[chan_index].in_reconnect = true; } @@ -102,6 +105,8 @@ cifs_chan_clear_in_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index = cifs_ses_get_chan_index(ses, server); + if (chan_index == CIFS_INVAL_CHAN_INDEX) + return; ses->chans[chan_index].in_reconnect = false; } @@ -111,6 +116,8 @@ cifs_chan_in_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index = cifs_ses_get_chan_index(ses, server); + if (chan_index == CIFS_INVAL_CHAN_INDEX) + return true; /* err on the safer side */ return CIFS_CHAN_IN_RECONNECT(ses, chan_index); } @@ -120,6 +127,8 @@ cifs_chan_set_need_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index = cifs_ses_get_chan_index(ses, server); + if (chan_index == CIFS_INVAL_CHAN_INDEX) + return; set_bit(chan_index, &ses->chans_need_reconnect); cifs_dbg(FYI, "Set reconnect bitmask for chan %u; now 0x%lx\n", @@ -131,6 +140,8 @@ cifs_chan_clear_need_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index = cifs_ses_get_chan_index(ses, server); + if (chan_index == CIFS_INVAL_CHAN_INDEX) + return; clear_bit(chan_index, &ses->chans_need_reconnect); cifs_dbg(FYI, "Cleared reconnect bitmask for chan %u; now 0x%lx\n", @@ -142,6 +153,8 @@ cifs_chan_needs_reconnect(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index = cifs_ses_get_chan_index(ses, server); + if (chan_index == CIFS_INVAL_CHAN_INDEX) + return true; /* err on the safer side */ return CIFS_CHAN_NEEDS_RECONNECT(ses, chan_index); } @@ -151,6 +164,8 @@ cifs_chan_is_iface_active(struct cifs_ses *ses, struct TCP_Server_Info *server) { unsigned int chan_index = cifs_ses_get_chan_index(ses, server); + if (chan_index == CIFS_INVAL_CHAN_INDEX) + return true; /* err on the safer side */ return ses->chans[chan_index].iface && ses->chans[chan_index].iface->is_active; @@ -269,7 +284,7 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) spin_lock(&ses->chan_lock); chan_index = cifs_ses_get_chan_index(ses, server); - if (!chan_index) { + if (chan_index == CIFS_INVAL_CHAN_INDEX) { spin_unlock(&ses->chan_lock); return 0; } @@ -319,6 +334,11 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server) spin_lock(&ses->chan_lock); chan_index = cifs_ses_get_chan_index(ses, server); + if (chan_index == CIFS_INVAL_CHAN_INDEX) { + spin_unlock(&ses->chan_lock); + return 0; + } + ses->chans[chan_index].iface = iface; /* No iface is found. if secondary chan, drop connection */ diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c index 23c50ed7d4b5..84ea67301303 100644 --- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -413,7 +413,13 @@ generate_smb3signingkey(struct cifs_ses *ses, ses->ses_status == SES_GOOD); chan_index = cifs_ses_get_chan_index(ses, server); - /* TODO: introduce ref counting for channels when the can be freed */ + if (chan_index == CIFS_INVAL_CHAN_INDEX) { + spin_unlock(&ses->chan_lock); + spin_unlock(&ses->ses_lock); + + return -EINVAL; + } + spin_unlock(&ses->chan_lock); spin_unlock(&ses->ses_lock); -- 2.39.2