Hi Chuck

> -----Original Message-----
> From: Chuck Lever III <chuck.lever@xxxxxxxxxx>
> Sent: September 26, 2024 4:39
> To: Chen, Hanxiao/陈 晗霄 <chenhx.fnst@xxxxxxxxxxx>
> Cc: Linux NFS Mailing List <linux-nfs@xxxxxxxxxxxxxxx>; ltp@xxxxxxxxxxxxxx;
> Ma, Xinjian/马 新建 <maxj.fnst@xxxxxxxxxxx>; Steve Dickson <SteveD@xxxxxxxxxx>
> Subject: Re: TI-RPC test failures; network configuration related?
>
> > On Sep 25, 2024, at 6:00 AM, Hanxiao Chen (Fujitsu) <chenhx.fnst@xxxxxxxxxxx> wrote:
> >
> >> -----Original Message-----
> >> From: ltp <ltp-bounces+chenhx.fnst=fujitsu.com@xxxxxxxxxxxxxx> On Behalf Of
> >> Chuck Lever III via ltp
> >> Sent: September 12, 2024 23:50
> >> To: ltp@xxxxxxxxxxxxxx
> >> Subject: Re: [LTP] TI-RPC test failures; network configuration related?
> >>
> >>> On Aug 29, 2024, at 3:35 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> >>>
> >>> For a while now my nightly "runltp -f net.tirpc_tests" have thrown a
> >>> bunch of failures but I haven't had time to look into it until now.
> >>> Without modification, about half of the client test programs
> >>> segfault.
> >>>
> >>> Here's a sample test failure. I instrumented the tirpc_clnt_destroy
> >>> test case and the rpc_tests.sh script as shown below, but I still
> >>> don't understand why clnt_create(3t) is failing.
> >>>
> >
> > Hi, Chuck
> >
> > I can reproduce this issue on my CentOS 10 stream machine with upstream LTP.
> > libtirpc-1.3.5-0.el10.x86_64
> > rpcbind-1.2.7-2.el10.x86_64
> >
> > In my limited investigation, it looks like libtirpc returns NULL when
> > LTP trying to create client.
> >
> > 937 __rpcb_findaddr_timed(program, version, nconf, host, clpp, tp)
> > ...
> > 1023 CLNT_CONTROL(client, CLSET_VERS, (char *)(void *)&vers);
> > 1024 clnt_st = CLNT_CALL(client, (rpcproc_t)RPCBPROC_GETADDR,
> > 1025     (xdrproc_t) xdr_rpcb, (char *)(void *)&parms,
> > 1026     (xdrproc_t) xdr_wrapstring, (char *)(void *) &ua, *tp);
> >
> > The ua got "" of line 1026
> >
> > 1027 switch (clnt_st) {
> > 1028 case RPC_SUCCESS:
> > 1029     if ((ua == NULL) || (ua[0] == 0)) {
> > 1030         /* address unknown */
> > 1031         rpc_createerr.cf_stat = RPC_PROGNOTREGISTERED;
> > 1032         goto error;
> > 1033     }
> >
> > May be rpcbproc_getaddr_com of rpcbind broken?
>
> The program is registered on one of the veth interfaces.
> The rpcinfo works there. The test program is running on another veth, and it
> can't see the first veth at all (no route to host). So the clnt_create(3) fails.
>
> There is some kind of configuration problem on my test system. Was traveling
> last week, but I have some time to look at it again now.
>
> > Hi, Ma
> >
> > Can you fix tirpc cases to let LTP get rid of segfault?
>
> All the RPC test programs assume that libtirpc will return a non-NULL clnt, and
> simply proceed to call CLNT_DESTROY, which segfaults in these error cases.
>
> If the test configuration is not correct, the API returns NULL and sets cf_stat. It
> would be helpful to display the cf_stat error in those cases, and skip
> CLNT_DESTROY.

Got it. I will send patches to get rid of the segfaults in LTP.

Best regards
Ma
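
For reference, the guard described above (check for a NULL client, report the creation status, skip CLNT_DESTROY) might look roughly like the sketch below. This is not the actual LTP patch. Every demo_* name is a hypothetical stand-in so the snippet builds without libtirpc; a real test case would use CLIENT, clnt_create(), rpc_createerr.cf_stat and CLNT_DESTROY() from <rpc/rpc.h>.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins so this builds without libtirpc. In a real test,
 * use CLIENT, clnt_create(), rpc_createerr.cf_stat and CLNT_DESTROY()
 * from <rpc/rpc.h> instead. */
typedef struct { int unused; } demo_client;

int demo_cf_stat;        /* stands in for rpc_createerr.cf_stat */
int demo_destroy_calls;  /* counts calls to the destroy stand-in */

/* Stand-in for clnt_create(): a NULL host models the unreachable server. */
demo_client *demo_clnt_create(const char *host)
{
    static demo_client client;

    if (host == NULL) {
        demo_cf_stat = 1; /* arbitrary nonzero value, not a real clnt_stat */
        return NULL;
    }
    demo_cf_stat = 0;
    return &client;
}

/* Stand-in for CLNT_DESTROY(): must never be handed a NULL client. */
void demo_clnt_destroy(demo_client *clnt)
{
    (void)clnt;
    demo_destroy_calls++;
}

/* Returns 0 if the client was created and destroyed, 1 if creation failed. */
int run_destroy_test(const char *host)
{
    demo_client *clnt = demo_clnt_create(host);

    if (clnt == NULL) {
        /* Report the creation status and skip the destroy call. */
        fprintf(stderr, "client creation failed: cf_stat=%d\n", demo_cf_stat);
        return 1;
    }

    demo_clnt_destroy(clnt);
    return 0;
}
```

In a real test case, clnt_spcreateerror() or clnt_pcreateerror() from libtirpc could replace the fprintf() call, since both turn the stored creation error into a readable message.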