> On Sep 25, 2024, at 6:00 AM, Hanxiao Chen (Fujitsu) <chenhx.fnst@xxxxxxxxxxx> wrote:
>
>
>
>> -----Original Message-----
>> From: ltp <ltp-bounces+chenhx.fnst=fujitsu.com@xxxxxxxxxxxxxx> on behalf of
>> Chuck Lever III via ltp
>> Sent: September 12, 2024 23:50
>> To: ltp@xxxxxxxxxxxxxx
>> Subject: Re: [LTP] TI-RPC test failures; network configuration related?
>>
>>
>>
>>> On Aug 29, 2024, at 3:35 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx>
>>> wrote:
>>>
>>> For a while now my nightly "runltp -f net.tirpc_tests" have
>>> thrown a bunch of failures but I haven't had time to look
>>> into it until now. Without modification, about half of the
>>> client test programs segfault.
>>>
>>> Here's a sample test failure. I instrumented the
>>> tirpc_clnt_destroy test case and the rpc_tests.sh script as
>>> shown below, but I still don't understand why clnt_create(3t)
>>> is failing.
>>>
>
> Hi, Chuck
>
> I can reproduce this issue on my CentOS Stream 10 machine with upstream LTP.
> libtirpc-1.3.5-0.el10.x86_64
> rpcbind-1.2.7-2.el10.x86_64
>
> In my limited investigation, it looks like libtirpc returns NULL
> when LTP tries to create a client.
>
> 937  __rpcb_findaddr_timed(program, version, nconf, host, clpp, tp)
> ...
> 1023     CLNT_CONTROL(client, CLSET_VERS, (char *)(void *)&vers);
> 1024     clnt_st = CLNT_CALL(client, (rpcproc_t)RPCBPROC_GETADDR,
> 1025         (xdrproc_t) xdr_rpcb, (char *)(void *)&parms,
> 1026         (xdrproc_t) xdr_wrapstring, (char *)(void *) &ua, *tp);
>
> After the call at line 1026, ua is "".
>
> 1027     switch (clnt_st) {
> 1028     case RPC_SUCCESS:
> 1029         if ((ua == NULL) || (ua[0] == 0)) {
> 1030             /* address unknown */
> 1031             rpc_createerr.cf_stat = RPC_PROGNOTREGISTERED;
> 1032             goto error;
> 1033         }
>
> Maybe rpcbproc_getaddr_com in rpcbind is broken?

The program is registered on one of the veth interfaces, and
rpcinfo works there. The test program is running on another
veth, and it can't see the first veth at all (no route to host).
So the clnt_create(3) fails.
There is some kind of configuration problem on my test system.
I was traveling last week, but I have some time to look at it
again now.

> Hi, Ma
>
> Can you fix tirpc cases to let LTP get rid of segfault?

All the RPC test programs assume that libtirpc will return a
non-NULL clnt, and simply proceed to call CLNT_DESTROY, which
segfaults in these error cases.

If the test configuration is not correct, the API returns NULL
and sets cf_stat. It would be helpful to display the cf_stat
error in those cases, and skip CLNT_DESTROY.

--
Chuck Lever
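
A minimal sketch of the NULL guard described above, for illustration
only. It is not taken from LTP: handle_usable() is a hypothetical
helper name, and the call site shown in the trailing comment assumes
a test program that includes <rpc/rpc.h> and uses libtirpc's
clnt_create(), clnt_spcreateerror() and clnt_destroy(). The helper
itself is plain C, so it builds without libtirpc.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of LTP or libtirpc).
 *
 * Returns 1 if the client handle may be used and later destroyed,
 * 0 if the create call failed and the test should bail out.
 * "reason" is the error text to show on failure; in a real test it
 * would describe rpc_createerr.cf_stat.
 */
static int handle_usable(const void *clnt, const char *reason)
{
	if (clnt != NULL)
		return 1;

	fprintf(stderr, "client handle is NULL: %s\n",
		reason != NULL ? reason : "no error text available");
	return 0;
}

/*
 * Assumed call site in a test program (sketch, not compiled here):
 *
 *	clnt = clnt_create(host, prognum, versnum, nettype);
 *	if (!handle_usable(clnt, clnt_spcreateerror("clnt_create")))
 *		return 1;	// report failure, skip clnt_destroy()
 *	clnt_destroy(clnt);
 */
```

With a guard like this, a misconfigured network would show up as
a reported cf_stat error (for example "Program not registered")
rather than a segfault in CLNT_DESTROY.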