Hi Dai- Thanks for posting. I agree this is a real bug, but I'm going to quibble a little with the details of your proposed solution. > On Aug 7, 2021, at 1:02 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote: > > Currently my_svc_run does not handle poll time allowing idle TCP s/poll time/poll timeout/ > connections to remain ESTABLISHED indefinitely. When the number > of connections reaches the limit the open file descriptors > (ulimit -n) then accept(2) fails with EMFILE. Since libtirpc does > not handle EMFILE returned from accept(2) Just curious, should libtirpc be handling EMFILE? > this get my_svc_run into > a tight loop calling accept(2) resulting in the RPC service being > down, it's no longer able to service any requests. > > Fix by add call to __svc_destroy_idle to my_svc_run to remove > inactive connections when poll(2) returns timeout. > > Fixes: b2c9430f46c4 Use poll() instead of select() in svc_run() I don't have this commit in my rpcbind repo. I do have 44bf15b86861 ("rpcbind: don't use obsolete svc_fdset interface of libtirpc"). IMO that's the one you should list here. Also, please Cc: Thorsten Kukuk <kukuk@xxxxxxxxxx> on future versions of this patch series in case he has any comments. > Signed-off-by: dai.ngo@xxxxxxxxxx > --- > src/rpcb_svc_com.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/src/rpcb_svc_com.c b/src/rpcb_svc_com.c > index 1743dadf5db7..cb33519010d3 100644 > --- a/src/rpcb_svc_com.c > +++ b/src/rpcb_svc_com.c > @@ -1048,6 +1048,8 @@ netbuffree(struct netbuf *ap) > } > > > +extern void __svc_destroy_idle(int, bool_t); Your libtirpc patch does not add __svc_destroy_idle() as part of the official libtirpc API. We really must avoid adding "secret" library entry points (not to mention, as SteveD pointed out, a SONAME change would be required). rpcbind used to call __svc_clean_idle(), which is undocumented (as far as I can tell). 44bf15b86861 removed that call. I think a better approach would be to duplicate your proposed __svc_destroy_idle() code in rpcbind, since rpcbind is not actually using the library's RPC dispatcher. That would get rid of the technical debt of calling an undocumented library API. The helper would need to call svc_vc_destroy() rather than __xprt_unregister_unlocked() and __svc_vc_dodestroy(). Also not clear to me that rpcbind's my_svc_run() needs the protection of holding svc_fd_lock, if rpcbind itself is not multi-threaded? The alternative would be to construct a fully documented public API with man pages and header declarations. I'm not terribly comfortable with diverging from the historical Sun TI-RPC programming interface, however. > + > void > my_svc_run() > { > @@ -1076,6 +1078,7 @@ my_svc_run() > * other outside event) and not caused by poll(). > */ > case 0: > + __svc_destroy_idle(30, FALSE); > continue; > default: > /* > -- > 2.26.2 > -- Chuck Lever