On Thursday, 06.09.2018, at 10:34 -0500, Steve French wrote:
> Can you verify that /proc/fs/cifs/Stats (when the hang occurs) does
> not show additional session or share reconnects?

No reconnects. I've got a bunch of files for you (attached).

Maybe some general info:

cat /proc/version
Linux version 4.18.6-gentoo (root@pst15) (gcc version 8.2.0 (Gentoo 8.2.0-r2 p1.2)) #1 SMP PREEMPT Thu Sep 6 11:53:00 CEST 2018

cifs is compiled as a module.

It really seems to depend on RAM state (uninitialized variable or
something?). After one warm reboot, I couldn't reproduce the problem
with vers=2.1 or vers=3. After the next cold reboot, I could reproduce
it with both vers=2.1 and vers=3.

BTW, I have now found that the hangs don't last forever; they just take
minutes.

Attached you will find:
- vers=2.1: DebugData without and with a hung process, Stats with a hung
  process
- vers=3: ditto.

I could start other processes to list the same directory (one with 67338
files seemed most "successful" to try, but as I wrote it also occurs for
small dirs, either with "ls -l" or python3's os.listdir).

I'm now back to 4.17, which works nicely with both vers=2.1 and vers=3.

> We have a problem (currently debugging) for which I recently added a
> trace message (in the for-next branch) which occurs when the session
> drops and we have to reconnect - when reconnecting, a previously
> issued pending operation fails and its SMB3 credits are credited back
> to the wrong (new vs old) session, thus causing the server and client
> to disagree about the number of operations that can be sent in parallel
> (which possibly could affect a large directory search).
>
> Thus my interest in seeing if a reconnect could be involved ...
> (even if not due to a network hang)
>
> Similarly, when the hang occurs, it would be helpful to know if we are
> waiting on the server (pending 'mids' will be visible for each session
> by dumping /proc/fs/cifs/DebugData)
>
> Do you have the output of /proc/fs/cifs/DebugData so we can see the
> session state and any pending operations?
>
> On Thu, Sep 6, 2018 at 10:25 AM Dr. Bernd Feige
> <bernd.feige@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thursday, 06.09.2018, at 08:36 -0500, Steve French wrote:
> > > To clarify a few things:
> > > - are you saying that you had the original older dialect (SMB2.0,
> > >   vers=2.0) signing problem, but now that that is resolved see
> > >   occasional hangs in listing directories
> >
> > Exactly! It may of course be that this is a different regression, but
> > it came with 4.18 as well...
> >
> > I now use vers=3 as mount option (the kernel fills the log with
> > warnings about the changed default if I leave it out...).
> > /proc/fs/cifs/DebugData (no Stats in there) says that everything is
> > Dialect 3 now (see below for an excerpt).
> >
> > > - do you see any correlation between the size of the directory and
> > >   hangs
> >
> > I thought so initially, as I first listed a few subdirs without
> > problems and then it hung as I listed one with >16000 entries, but
> > then it also hung once on the first attempt when listing a smaller
> > top-level directory.
> >
> > > - is a reconnect involved (I see mention of the krb5 upcall, which
> > >   presumably could hang in a reconnect scenario if AD server were
> > >   not available to refresh the ticket and it had expired)? You can
> > >   see the number of reconnects (if any) in /proc/fs/cifs/Stats
> >
> > This all happens within minutes after an AD login; I'm quite sure
> > that no expiration is involved.
> >
> > > - if it is a reconnect any idea if intermittent network issue or
> > >   hung server was the reason for the reconnect?
> >
> > I switch back and forth between 4.17.13 and 4.18.6, and it happens
> > every time I try in 4.18.6 but never in 4.17.13. There's definitely
> > no connectivity or service problem.
> >
> > > - for the hung directory examples are you seeing them with smb3
> > >   (which presumably is the most common dialect being used and
> > >   safest) or earlier dialect?
> >
> > Yes, if what DebugData reports is correct...
> >
> > > - what is the server type?
> >
> > It's a Microsoft system (not samba) which supports up to 3.11 as
> > reported by nmap. Is there a way to probe it more exactly?
> >
> > Note that /proc/fs/cifs/LinuxExtensionsEnabled is 1 although I
> > didn't specifically request it.
> >
> > From DebugData:
> > Features: dfs spnego xattr acl
> >
> > DFS server entry: "Dialect 0x302 signed"
> > file server entry: "Dialect 0x300"
> > PathComponentMax: 255 Status: 1 type: DISK
> > Share Capabilities: None Aligned, Partition Aligned, TRIM
> > support, Share Flags: 0x30 Optimal sector size: 0x1000
> >
> > MIDs:
> > State: 2 com: 6 pid: 27772 cbdata: 00000000634d19f4 mid 6581
> >
> > > On Thu, Sep 6, 2018 at 7:30 AM Dr. Bernd Feige
> > > <bernd.feige@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > Dear Steve et al.,
> > > >
> > > > I'm running Linux 4.18.6 in a corporate environment and now
> > > > have the issue that listing directories lets the process hang
> > > > interminably, loading one CPU at 100%. This does not happen
> > > > every time (i.e. sometimes a directory listing completes).
> > > >
> > > > Note that this works solidly with 4.17.13.
> > > >
> > > > More verbatim:
> > > >
> > > > I had the problem the OP noted with 4.18.5 during upcall. I had
> > > > vers=2.1 in the mount options since the servers used to not
> > > > support vers=3. I didn't get a kernel oops but a hung mount
> > > > process. It worked with 4.17.13.
> > > >
> > > > Reading this thread, I then dropped the vers= option and found
> > > > that mounts worked again (still with 4.18.5) after confirming:
> > > >
> > > > nmap -Pn -p 445 --script smb-protocols ad
> > > >
> > > > PORT    STATE SERVICE
> > > > 445/tcp open  microsoft-ds
> > > >
> > > > Host script results:
> > > > | smb-protocols:
> > > > |   dialects:
> > > > |     NT LM 0.12 (SMBv1) [dangerous, but default]
> > > > |     2.02
> > > > |     2.10
> > > > |     3.00
> > > > |     3.02
> > > > |_    3.11
> > > >
> > > > However, it may be that the actual mount uses version 2 still:
> > > >
> > > > Sep 06 09:43:18 cifs.upcall[15995]: key description: cifs.spnego;0;0;39010000;ver=0x2;host=xxx;ip4=xxx;sec=krb5;uid=0x3e8;creduid=0x3e8;user=root;pid=0x671b
> > > > Sep 06 09:43:18 cifs.upcall[15995]: ver=2
> > > > Sep 06 09:43:18 cifs.upcall[15995]: host=xxx
> > > > Sep 06 09:43:18 cifs.upcall[15995]: ip=xxx
> > > > Sep 06 09:43:18 cifs.upcall[15995]: sec=1
> > > > Sep 06 09:43:18 cifs.upcall[15995]: uid=1000
> > > > Sep 06 09:43:18 cifs.upcall[15995]: creduid=1000
> > > > Sep 06 09:43:18 cifs.upcall[15995]: user=root
> > > > Sep 06 09:43:18 cifs.upcall[15995]: pid=26395
> > > > Sep 06 09:43:18 cifs.upcall[15995]: get_cachename_from_process_env: pathname=/proc/26395/environ
> > > > Sep 06 09:43:18 cifs.upcall[15995]: get_cachename_from_process_env: read to end of buffer (4096 bytes)
> > > > Sep 06 09:43:18 cifs.upcall[15995]: get_existing_cc: default ccache is FILE:/tmp/krb5cc_1000
> > > > Sep 06 09:43:18 cifs.upcall[15995]: handle_krb5_mech: getting service ticket for xxx
> > > > Sep 06 09:43:18 cifs.upcall[15995]: handle_krb5_mech: obtained service ticket
> > > > Sep 06 09:43:18 cifs.upcall[15995]: Exit status 0
> > > >
> > > > Thanks and best regards,
> > > > Bernd
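
For the reconnect check discussed in this thread, a small Python sketch along these lines might be handy. It assumes the Stats text contains a line of the form "N session M share reconnects", as the attached dumps do; the function name reconnect_counts is made up for illustration and is not part of any cifs tooling.

```python
import re

def reconnect_counts(stats_text):
    """Return (session_reconnects, share_reconnects) parsed from the text
    of /proc/fs/cifs/Stats, or None if the counter line is not found."""
    m = re.search(r"(\d+) session (\d+) share reconnects", stats_text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

# Excerpt in the same shape as the attached Stats dumps.
sample = (
    "Operations (MIDs): 0\n"
    "\n"
    "0 session 0 share reconnects\n"
    "Total vfs operations: 453 maximum at one time: 2\n"
)
print(reconnect_counts(sample))  # prints (0, 0)
```

On a live system the text would come from reading /proc/fs/cifs/Stats while the listing is hung; a non-zero pair would point towards the reconnect scenario.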
Display Internal CIFS Data Structures for Debugging
---------------------------------------------------
CIFS Version 2.12
Features: DFS,UPCALL(SPNEGO),XATTR,ACL
Active VFS Requests: 0
Servers:
Number of credits: 31999 Dialect 0x210
1) Name: 139 Uses: 1 Capability: 0x300003 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x1 Req On Wire: 0
    Shares:
    0) IPC: \\139\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0

    1) \\fsgroup4\group Mounts: 1 DevInfo: 0x20 Attributes: 0x400cf
    PathComponentMax: 255 Status: 1 type: DISK

    MIDs:

Number of credits: 266 Dialect 0x210 signed
2) Name: 10 Uses: 1 Capability: 0x300007 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x3 Req On Wire: 0
    Shares:
    0) IPC: \\10\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0

    1) \\DC1.ad\Group Mounts: 1 DevInfo: 0x20020 Attributes: 0xc700ff
    PathComponentMax: 255 Status: 1 type: DISK

    MIDs:
Display Internal CIFS Data Structures for Debugging
---------------------------------------------------
CIFS Version 2.12
Features: DFS,STATS,UPCALL(SPNEGO),XATTR,ACL
Active VFS Requests: 0
Servers:
Number of credits: 582 Dialect 0x210
1) Name: 139 Uses: 1 Capability: 0x300003 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x1 Req On Wire: 0
    Shares:
    0) IPC: \\139\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0

    1) \\fsgroup4\group Mounts: 1 DevInfo: 0x20 Attributes: 0x400cf
    PathComponentMax: 255 Status: 1 type: DISK

    MIDs:

Number of credits: 143 Dialect 0x210 signed
2) Name: 10 Uses: 1 Capability: 0x300007 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x3 Req On Wire: 0
    Shares:
    0) IPC: \\10\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0

    1) \\DC1.ad\Group Mounts: 1 DevInfo: 0x20020 Attributes: 0xc700ff
    PathComponentMax: 255 Status: 1 type: DISK

    MIDs:
Resources in use
CIFS Session: 2
Share (unique mount targets): 4
SMB Request/Response Buffer: 2 Pool size: 6
SMB Small Req/Resp Buffer: 2 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 453 maximum at one time: 2

1) \\fsgroup4\group
SMBs: 451
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 0 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 0 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 1 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed

2) \\DC1.ad\Group
SMBs: 13
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 2 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 0 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 0 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed
Display Internal CIFS Data Structures for Debugging
---------------------------------------------------
CIFS Version 2.12
Features: DFS,UPCALL(SPNEGO),XATTR,ACL
Active VFS Requests: 0
Servers:
Number of credits: 31999 Dialect 0x300
1) Name: 139 Uses: 1 Capability: 0x300053 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x1 Req On Wire: 0
    Shares:
    0) IPC: \\139\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0
    Share Capabilities: None Share Flags: 0x0
    tid: 0x1 Maximal Access: 0x1f01ff

    1) \\fsgroup4\group Mounts: 1 DevInfo: 0x20 Attributes: 0x400cf
    PathComponentMax: 255 Status: 1 type: DISK
    Share Capabilities: None Aligned, Partition Aligned, TRIM-support, Share Flags: 0x30
    tid: 0x2 Optimal sector size: 0x1000 Maximal Access: 0x1301bf

    MIDs:

Number of credits: 351 Dialect 0x302 signed
2) Name: 10 Uses: 1 Capability: 0x300047 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x3 Req On Wire: 0
    Shares:
    0) IPC: \\10\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0
    Share Capabilities: None Share Flags: 0x30
    tid: 0x1 Maximal Access: 0x11f01ff

    1) \\DC1.ad\Group Mounts: 1 DevInfo: 0x20020 Attributes: 0xc700ff
    PathComponentMax: 255 Status: 1 type: DISK
    Share Capabilities: DFS, Aligned, Partition Aligned, Share Flags: 0x803
    tid: 0x5 Optimal sector size: 0x200 Maximal Access: 0x1200a9

    MIDs:

    Server interfaces: 1
    0) Speed: 2000000000 bps
       Capabilities: rss
       IPv4: 10
Display Internal CIFS Data Structures for Debugging
---------------------------------------------------
CIFS Version 2.12
Features: DFS,STATS,UPCALL(SPNEGO),XATTR,ACL
Active VFS Requests: 0
Servers:
Number of credits: 583 Dialect 0x300
1) Name: 139 Uses: 1 Capability: 0x300053 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x1 Req On Wire: 0
    Shares:
    0) IPC: \\139\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0
    Share Capabilities: None Share Flags: 0x0
    tid: 0x1 Maximal Access: 0x1f01ff

    1) \\fsgroup4\group Mounts: 1 DevInfo: 0x20 Attributes: 0x400cf
    PathComponentMax: 255 Status: 1 type: DISK
    Share Capabilities: None Aligned, Partition Aligned, TRIM-support, Share Flags: 0x30
    tid: 0x2 Optimal sector size: 0x1000 Maximal Access: 0x1301bf

    MIDs:

Number of credits: 142 Dialect 0x302 signed
2) Name: 12 Uses: 1 Capability: 0x300047 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x3 Req On Wire: 0
    Shares:
    0) IPC: \\12\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0
    Share Capabilities: None Share Flags: 0x30
    tid: 0x1 Maximal Access: 0x11f01ff

    1) \\DC2.ad\Group Mounts: 1 DevInfo: 0x20020 Attributes: 0xc700ff
    PathComponentMax: 255 Status: 1 type: DISK
    Share Capabilities: DFS, Aligned, Partition Aligned, Share Flags: 0x803
    tid: 0x5 Optimal sector size: 0x200 Maximal Access: 0x1200a9

    MIDs:

    Server interfaces: 1
    0) Speed: 2000000000 bps
       Capabilities: rss
       IPv4: 12
Display Internal CIFS Data Structures for Debugging
---------------------------------------------------
CIFS Version 2.12
Features: DFS,STATS,UPCALL(SPNEGO),XATTR,ACL
Active VFS Requests: 0
Servers:
Number of credits: 16135 Dialect 0x300
1) Name: 139 Uses: 1 Capability: 0x300053 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x1 Req On Wire: 0
    Shares:
    0) IPC: \\139\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0
    Share Capabilities: None Share Flags: 0x0
    tid: 0x1 Maximal Access: 0x1f01ff

    1) \\fsgroup4\group Mounts: 1 DevInfo: 0x20 Attributes: 0x400cf
    PathComponentMax: 255 Status: 1 type: DISK
    Share Capabilities: None Aligned, Partition Aligned, TRIM-support, Share Flags: 0x30
    tid: 0x2 Optimal sector size: 0x1000 Maximal Access: 0x1301bf

    MIDs:

Number of credits: 226 Dialect 0x302 signed
2) Name: 12 Uses: 1 Capability: 0x300047 Session Status: 1 TCP status: 1
    Local Users To Server: 1 SecMode: 0x3 Req On Wire: 0
    Shares:
    0) IPC: \\12\IPC$ Mounts: 1 DevInfo: 0x0 Attributes: 0x0
    PathComponentMax: 0 Status: 1 type: 0
    Share Capabilities: None Share Flags: 0x30
    tid: 0x1 Maximal Access: 0x11f01ff

    1) \\DC2.ad\Group Mounts: 1 DevInfo: 0x20020 Attributes: 0xc700ff
    PathComponentMax: 255 Status: 1 type: DISK
    Share Capabilities: DFS, Aligned, Partition Aligned, Share Flags: 0x803
    tid: 0x5 Optimal sector size: 0x200 Maximal Access: 0x1200a9

    MIDs:

    Server interfaces: 1
    0) Speed: 2000000000 bps
       Capabilities: rss
       IPv4: 12
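
Since the credit counts differ quite a bit between the DebugData dumps, a rough Python sketch for pulling them out side by side could look like this. It assumes each server entry is introduced by a "Number of credits: N Dialect 0x..." line as in the attachments; server_credits, dump_a and dump_b are illustrative names only, and the two excerpts are taken from two of the vers=3 dumps without claiming which one belongs to the hung state.

```python
import re

# One match per server entry in the DebugData text.
CREDITS = re.compile(r"Number of credits: (\d+) Dialect (0x[0-9a-fA-F]+)")

def server_credits(debugdata_text):
    """Return a list of (credits, dialect) tuples, in order of appearance."""
    return [(int(n), dialect) for n, dialect in CREDITS.findall(debugdata_text)]

# Excerpts from two of the attached vers=3 dumps (other lines elided).
dump_a = ("Number of credits: 31999 Dialect 0x300\n"
          "Number of credits: 351 Dialect 0x302 signed\n")
dump_b = ("Number of credits: 583 Dialect 0x300\n"
          "Number of credits: 142 Dialect 0x302 signed\n")

for (a, dialect), (b, _) in zip(server_credits(dump_a), server_credits(dump_b)):
    print(dialect, a, b)
# prints:
# 0x300 31999 583
# 0x302 351 142
```

The regex works on the flattened or the multi-line form of the dump alike, since it only relies on the text within one "Number of credits" line.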
Resources in use
CIFS Session: 2
Share (unique mount targets): 4
SMB Request/Response Buffer: 2 Pool size: 6
SMB Small Req/Resp Buffer: 2 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 453 maximum at one time: 2

1) \\fsgroup4\group
SMBs: 451
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 0 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 1 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 1 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed

2) \\DC2.ad\Group
SMBs: 11
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 2 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 0 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 0 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed
Resources in use
CIFS Session: 2
Share (unique mount targets): 4
SMB Request/Response Buffer: 2 Pool size: 6
SMB Small Req/Resp Buffer: 2 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 28431 maximum at one time: 3

1) \\fsgroup4\group
SMBs: 28223
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 0 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 69 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 26 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed

2) \\DC2.ad\Group
SMBs: 62
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 53 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 0 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 0 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed
Resources in use
CIFS Session: 2
Share (unique mount targets): 4
SMB Request/Response Buffer: 2 Pool size: 6
SMB Small Req/Resp Buffer: 2 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 45443 maximum at one time: 4

1) \\fsgroup4\group
SMBs: 44787
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 0 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 94 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 45 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed

2) \\DC2.ad\Group
SMBs: 94
Negotiates: 0 sent 0 failed
SessionSetups: 0 sent 0 failed
Logoffs: 0 sent 0 failed
TreeConnects: 0 sent 0 failed
TreeDisconnects: 0 sent 0 failed
Creates: 0 sent 85 failed
Closes: 0 sent 0 failed
Flushes: 0 sent 0 failed
Reads: 0 sent 0 failed
Writes: 0 sent 0 failed
Locks: 0 sent 0 failed
IOCTLs: 0 sent 0 failed
Cancels: 0 sent 0 failed
Echos: 0 sent 0 failed
QueryDirectories: 0 sent 0 failed
ChangeNotifies: 0 sent 0 failed
QueryInfos: 0 sent 0 failed
SetInfos: 0 sent 0 failed
OplockBreaks: 0 sent 0 failed
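
To compare the Stats dumps without reading every counter, a Python sketch like the following could list only the operations with a non-zero "failed" count per share. It assumes share headers of the form "N) \\server\share" and counter entries of the form "Name: X sent Y failed", as seen in the attachments; failed_ops is a made-up helper name, and the sample string is a shortened excerpt of one of the dumps above.

```python
import re

# Either a share header ("1) \\server\share") or an operation counter.
TOKEN = re.compile(r"\d+\) (\\\\\S+)|(\w+): (\d+) sent (\d+) failed")

def failed_ops(stats_text):
    """Map each share name to {operation: failed_count} for counts > 0."""
    result = {}
    share = None
    for m in TOKEN.finditer(stats_text):
        if m.group(1):                      # new share section starts
            share = m.group(1)
            result[share] = {}
        elif share is not None and int(m.group(4)) > 0:
            result[share][m.group(2)] = int(m.group(4))
    return result

# Shortened excerpt from one of the attached Stats dumps.
sample = (r"1) \\fsgroup4\group SMBs: 28223 Locks: 0 sent 0 failed "
          r"IOCTLs: 0 sent 69 failed QueryDirectories: 0 sent 26 failed "
          r"2) \\DC2.ad\Group SMBs: 62 Creates: 0 sent 53 failed "
          r"Closes: 0 sent 0 failed")

for share, ops in failed_ops(sample).items():
    print(share, ops)
```

Because it scans for tokens rather than lines, the same function should work whether the dump has its original line breaks or has been flattened onto one line.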