Is there any plan for this ERESTARTSYS leak issue?

--
Zhitao Li, at SmartX

On Fri, Feb 23, 2024 at 6:31 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Thu, 2024-02-22 at 15:20 +0000, Trond Myklebust wrote:
> > On Thu, 2024-02-22 at 06:05 -0500, Jeff Layton wrote:
> > > On Wed, 2024-02-21 at 13:48 +0000, Trond Myklebust wrote:
> > > > On Wed, 2024-02-21 at 16:20 +0800, Zhitao Li wrote:
> > > > > Hi, everyone,
> > > > >
> > > > > - Facts:
> > > > > I have a remote NFS export and I mount the same export on two
> > > > > different directories in my OS with the same options. There is an
> > > > > inflight IO under one mounted directory. And then I unmount another
> > > > > mounted directory with force. The inflight IO ends up with "Unknown
> > > > > error 512", which is ERESTARTSYS.
> > > > >
> > > >
> > > > All of the above is well known. That's because forced umount affects
> > > > the entire filesystem. Why are you using it here in the first place? It
> > > > is not intended for casual use.
> > > >
> > >
> > > While I agree Trond's above statement, the kernel is not supposed to
> > > leak error codes that high into userland. Are you seeing ERESTARTSYS
> > > being returned to system calls? If so, which ones?
> >
> > The point of forced umount is to kill all RPC calls associated with the
> > filesystem in order to unblock the umount.
> > Basically, it triggers this code before the unmount starts:
> >
> > void nfs_umount_begin(struct super_block *sb)
> > {
> >         struct nfs_server *server;
> >         struct rpc_clnt *rpc;
> >
> >         server = NFS_SB(sb);
> >         /* -EIO all pending I/O */
> >         rpc = server->client_acl;
> >         if (!IS_ERR(rpc))
> >                 rpc_killall_tasks(rpc);
> >         rpc = server->client;
> >         if (!IS_ERR(rpc))
> >                 rpc_killall_tasks(rpc);
> > }
> >
> > So yes, that does signal all the way up to the application level, and
> > it is very much intended to do so.
>
> Returning an error to userland in this situation is fine, but userland
> programs aren't really equipped to deal with error numbers in this
> range.
>
> Emphasis on the first sentence in the comment in include/linux/errno.h:
>
> -------------------8<-----------------------
> /*
>  * These should never be seen by user programs.  To return one of ERESTART*
>  * codes, signal_pending() MUST be set.  Note that ptrace can observe these
>  * at syscall exit tracing, but they will never be left for the debugged user
>  * process to see.
>  */
> #define ERESTARTSYS     512
> #define ERESTARTNOINTR  513
> #define ERESTARTNOHAND  514     /* restart if no handler.. */
> #define ENOIOCTLCMD     515     /* No ioctl command */
> #define ERESTART_RESTARTBLOCK 516 /* restart by calling sys_restart_syscall */
> #define EPROBE_DEFER    517     /* Driver requests probe retry */
> #define EOPENSTALE      518     /* open found a stale dentry */
> #define ENOPARAM        519     /* Parameter not supported */
> -------------------8<-----------------------
>
> If these values are leaking into userland, then that seems like a bug.
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
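
For illustration only, here is a minimal userland sketch of how a caller could flag an errno value that falls in the range listed in the include/linux/errno.h excerpt quoted above. The KERN_* macro names and both helper functions are hypothetical; they are redefined locally on the assumption that userland errno headers do not export these codes, which would be consistent with the "Unknown error 512" text in the original report. This is not a proposed fix for the leak.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Bounds copied from the include/linux/errno.h excerpt quoted above.
 * The KERN_ prefix is hypothetical: the values are redefined locally
 * because this sketch assumes userland headers do not provide them. */
#define KERN_ERESTARTSYS 512
#define KERN_ENOPARAM    519

/* Hypothetical helper: returns 1 if err lies in the kernel-internal
 * range (512..519) that should never reach a user program, else 0. */
static int is_kernel_internal_errno(int err)
{
    return err >= KERN_ERESTARTSYS && err <= KERN_ENOPARAM;
}

/* Hypothetical helper: writes a diagnostic for err into buf.
 * Ordinary errno values are passed through strerror(). */
static void describe_errno(int err, char *buf, size_t len)
{
    if (is_kernel_internal_errno(err))
        snprintf(buf, len, "kernel-internal error %d leaked to userland", err);
    else
        snprintf(buf, len, "%s", strerror(err));
}
```

A caller would pass the errno saved right after a failed read() or write() to describe_errno() and log the result, so a leaked ERESTARTSYS stands out from an ordinary EIO in a bug report.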