On 18/10/2023 12:45, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" <lizhijian@xxxxxxxxxxx> writes:
>
>> On 17/10/2023 16:01, Markus Armbruster wrote:
>>> Li Zhijian <lizhijian@xxxxxxxxxxx> writes:
>>>
>>>> 'errno' is being widely used by applications when ibv_reg_mr returns NULL.
>>>> They all believe errno indicates the error on failure, so let's document
>>>> it explicitly.
>>>
>>> Similar issue with ibv_open_device().  Possibly more.
>>
>> You are right, ibv_open_device()'s call chains are more complicated,
>> I have not figured out if it ought to set errno though QEMU relies on it.
>
> I think a question to answer is for what purposes callers need errno.
>
> The only callers I know are in QEMU.  There are three:
>
> * qemu_rdma_reg_whole_ram_blocks() and qemu_rdma_register_and_get_keys()
>
>   When ibv_reg_mr() fails, maybe try again with IBV_ACCESS_ON_DEMAND
>   added to the protection attributes.
>
>   "Maybe": if errno is ENOTSUP, and ibv_query_device_ex() reports
>   IBV_ODP_SUPPORT.

librpma[1] is another project that registers an ODP MR like this.

[1] https://github.com/pmem/rpma/blob/f52c00d18821ac573a71e9f23a6d2e8695086e95/src/peer.c#L277

ibv_reg_mr() generally ends up calling into the kernel via ioctl(), in
which case the libc wrapper sets errno.

> * qemu_rdma_broken_ipv6_kernel()
>
>   This function appears to probe the devices returned by
>   ibv_get_device_list().
>
>   For each device in the list, in order: try to ibv_open_device().  If
>   it fails: ignore the device if errno is EPERM, else return failure.

DPDK reads errno after calling ibv_open_device()[2] and
ibv_get_device_list()[3].

[2] https://github.com/DPDK/dpdk/blob/5f9426b0618b7c2899f4d1444768f62739da1bce/drivers/net/mlx4/mlx4.c#L829
[3] https://github.com/DPDK/dpdk/blob/5f9426b0618b7c2899f4d1444768f62739da1bce/drivers/net/mlx4/mlx4.c#L802

I also think these APIs are intended to use errno to indicate the reason
for a failure, but that has not been done yet?
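
For illustration, the retry-on-ENOTSUP pattern that QEMU and librpma use
could be sketched roughly as below. This is only a sketch: demo_reg_mr(),
the DEMO_ACCESS_* flags and the odp_supported argument are hypothetical
stand-ins so the example compiles without libibverbs. They are not the
real ibv_reg_mr(), IBV_ACCESS_* or ibv_query_device_ex() definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the libibverbs definitions. */
#define DEMO_ACCESS_LOCAL_WRITE 0x1
#define DEMO_ACCESS_ON_DEMAND   0x40

static int demo_mr; /* dummy object; its address stands for a registered MR */

/*
 * Stub that mimics a registration call which fails with ENOTSUP unless
 * on-demand paging is requested. Returns NULL and sets errno on failure.
 */
static void *demo_reg_mr(void *addr, size_t len, int access)
{
    (void)addr;
    (void)len;
    if (!(access & DEMO_ACCESS_ON_DEMAND)) {
        errno = ENOTSUP;
        return NULL;
    }
    return &demo_mr;
}

/*
 * Register memory. If the first attempt fails with ENOTSUP and the caller
 * says the device supports ODP, retry once with the on-demand flag added.
 */
static void *reg_mr_with_odp_fallback(void *addr, size_t len, int access,
                                      int odp_supported)
{
    void *mr;

    assert(addr != NULL);
    errno = 0;
    mr = demo_reg_mr(addr, len, access);
    if (mr == NULL && errno == ENOTSUP && odp_supported) {
        access |= DEMO_ACCESS_ON_DEMAND;
        mr = demo_reg_mr(addr, len, access);
    }
    return mr;
}
```

The retry decision depends entirely on errno being set by the failing
call, which is why the documentation question matters to these callers.
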
>
> I'm not familiar with RDMA, and I can't say whether any of this makes
> sense.
>
> If it doesn't, we need to talk about what problem the QEMU code is
> trying to solve, and how to solve it properly.
>
> If it does, we have legitimate uses of errno, and we need to talk how to
> make errno usable safely, or else how to replace its use in QEMU.
>
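
As a rough sketch of the device-probing case (skip a device on EPERM,
fail on anything else), assuming the open call does set errno on failure:
demo_open_device() and the fail_errno table are hypothetical stand-ins
for ibv_open_device() and the list from ibv_get_device_list(), used only
so the example is self-contained. The one general point it shows is that
errno should be read right after the failing call, before anything else
can overwrite it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stub: "opens" device i. fail_errno[i] == 0 means success;
 * any other value is stored in errno and the call fails with -1.
 */
static int demo_open_device(const int *fail_errno, size_t i)
{
    if (fail_errno[i] != 0) {
        errno = fail_errno[i];
        return -1;
    }
    return 0;
}

/*
 * Probe n devices in order. Devices that fail with EPERM are ignored.
 * Returns the number of devices opened, or -1 (errno preserved) on the
 * first failure with any other error.
 */
static int probe_devices(const int *fail_errno, size_t n)
{
    int opened = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (demo_open_device(fail_errno, i) < 0) {
            int saved = errno; /* read immediately after the failure */

            if (saved == EPERM) {
                continue;      /* not permitted: skip this device */
            }
            errno = saved;
            return -1;
        }
        opened++;
    }
    return opened;
}
```
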