On Fri, Oct 30, 2020 at 12:11:07PM -0500, Bob Pearson wrote:
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
>
>   device->dma_mask != NULL
>
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
>
> This patch gives dma_mask a valid value extracted from the device
> backing the ndev used by rxe.
>
> Without this patch rdma_rxe does not function.
>
> N.B. This patch needs to be applied before the recent fix to add back
> IB_USER_VERBS_CMD_POST_SEND to uverbs_cmd_mask.
>
> Dennis Dallesandro reported that Parav's similar patch did not apply
> cleanly to rxe. This one does to for-next head of tree as of yesterday.
>
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@xxxxxxx>
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..c857e83323ed 100644
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1128,19 +1128,32 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>  	int err;
>  	struct ib_device *dev = &rxe->ib_dev;
>  	struct crypto_shash *tfm;
> +	u64 dma_mask;
>
>  	strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));
>
>  	dev->node_type = RDMA_NODE_IB_CA;
>  	dev->phys_port_cnt = 1;
>  	dev->num_comp_vectors = num_possible_cpus();
> -	dev->dev.parent = rxe_dma_device(rxe);
>  	dev->local_dma_lkey = 0;
>  	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>  			    rxe->ndev->dev_addr);
>  	dev->dev.dma_parms = &rxe->dma_parms;
>  	dma_set_max_seg_size(&dev->dev, UINT_MAX);
> -	dma_set_coherent_mask(&dev->dev, dma_get_required_mask(&dev->dev));
> +
> +	/* rdma_rxe never does real DMA but does rely on
> +	 * pinning user memory in MRs to avoid page faults
> +	 * in responder and completer tasklets. This code
> +	 * supplies a valid dma_mask from the underlying
> +	 * network device. It is never used but is checked.
> +	 */
> +	dev->dev.parent = rxe_dma_device(rxe);

Oh! This is another bug, the parent of an ib_device should never be
set to a net_device!! This is probably why we get all those mysterious
syzkaller faults :|

Just leave it NULL

> +	dma_mask = *(dev->dev.parent->dma_mask);
> +	err = dma_coerce_mask_and_coherent(&dev->dev, dma_mask);

Why not use Parav's logic?

Jason