On Tue, Jan 30, 2018 at 08:42:44PM +0000, Bart Van Assche wrote:
> On Tue, 2018-01-30 at 12:46 -0700, Jason Gunthorpe wrote:
> > On Tue, Jan 30, 2018 at 07:07:35PM +0000, Bart Van Assche wrote:
> > > On Tue, 2018-01-30 at 09:33 -0700, Jason Gunthorpe wrote:
> > > > On Tue, Jan 30, 2018 at 10:16:01AM -0600, Steve Wise wrote:
> > > >
> > > > > What is this a merge of exactly? I don't see the restrack stuff, for
> > > > > instance.
> > > >
> > > > Yesterday's for-next. You could merge it with the latest for-next..
> > > >
> > > > I updated it.
> > > >
> > > > I think we are done now, so for-next is what will be sent as the
> > > > pull-request and for-next-merged is the conflict resolution.
> > >
> > > Hello Jason,
> > >
> > > Although I have not yet tried to root-cause this, I want to let you know
> > > that with your for-linus-merged branch the following error message is
> > > reported if I try to run the srp-test software against the rdma_rxe driver:
> > >
> > > id_ext=0x505400fffe4a0b7b,ioc_guid=0x505400fffe4a0b7b,dest=192.168.122.76:5555,target_can_queue=1,queue_size=32,max_cmd_per_lun=32,max_sect=131072 >/sys/class/infiniband_srp/srp-rxe0-1/add_target failed: Cannot allocate memory
> > >
> > > In the kernel log I found the following:
> > >
> > > Jan 30 10:55:50 ubuntu-vm kernel: scsi host4: ib_srp: FR pool allocation failed (-12)
> > >
> > > With your for-next branch from a few days ago the same test ran fine.
> >
> > I don't have a guess for you..
> >
> > The difference between for-next and merged is only the inclusion of
> > v4.15? Could some v4.15 non-rdma code be causing issue here?
>
> Hello Jason,
>
> I should have mentioned that in the previous tests I ran I merged kernel
> v4.15-rc9 myself into the RDMA for-next branch. So this behavior was probably
> introduced by a patch that was queued recently on the RDMA for-next branch,
> e.g. RDMA resource tracking.

Ok, I think that is the only likely thing recently..
But your print above must be caused by this line, right:

static struct srp_fr_pool *srp_create_fr_pool(struct ib_device *device,
                                              struct ib_pd *pd, int pool_size,
                                              int max_page_list_len)
{
        ret = -ENOMEM;
        pool = kzalloc(sizeof(struct srp_fr_pool) +
                       pool_size * sizeof(struct srp_fr_desc), GFP_KERNEL);
        if (!pool)
                goto err;

Since you didn't report the ib_alloc_mr() print it can't be the other
ENOMEM case?

Hard to see how that intersects with resource tracking.. Are you
thinking memory corruption?

Jason
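
For illustration only, here is a minimal user-space sketch of the reasoning above: a pool constructor with two distinct -ENOMEM paths, where only the per-entry allocation failure logs a message. It is not the ib_srp code. The names fr_pool, fr_desc, mock_alloc_mr(), create_fr_pool(), the fail_site enum and the two inject_* knobs are all hypothetical, and calloc() stands in for kzalloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the ib_srp definitions. */
struct fr_desc {
	void *mr;
};

struct fr_pool {
	int size;
	struct fr_desc desc[];	/* flexible array, sized at allocation time */
};

/* Records which step failed. This exists only in the sketch. */
enum fail_site { FAIL_NONE, FAIL_POOL_ALLOC, FAIL_MR_ALLOC };

/* Hypothetical fault-injection knobs so both paths can be exercised. */
static int inject_pool_alloc_failure;
static int inject_mr_alloc_failure_at = -1;

/* Stand-in for a per-entry allocation such as ib_alloc_mr(). */
static void *mock_alloc_mr(int index)
{
	if (index == inject_mr_alloc_failure_at)
		return NULL;
	return malloc(1);
}

static void destroy_fr_pool(struct fr_pool *pool)
{
	int i;

	if (!pool)
		return;
	for (i = 0; i < pool->size; i++)
		free(pool->desc[i].mr);
	free(pool);
}

static int create_fr_pool(int pool_size, struct fr_pool **out,
			  enum fail_site *site)
{
	struct fr_pool *pool;
	int i;

	assert(out != NULL && site != NULL);
	*out = NULL;
	*site = FAIL_NONE;

	if (pool_size <= 0)	/* the sketch's own guard */
		return -EINVAL;

	/* First ENOMEM path: the pool allocation itself fails, silently. */
	pool = inject_pool_alloc_failure ? NULL :
		calloc(1, sizeof(*pool) +
			  (size_t)pool_size * sizeof(struct fr_desc));
	if (!pool) {
		*site = FAIL_POOL_ALLOC;
		return -ENOMEM;
	}

	/* Second ENOMEM path: a per-entry allocation fails and is logged. */
	for (i = 0; i < pool_size; i++) {
		pool->desc[i].mr = mock_alloc_mr(i);
		if (!pool->desc[i].mr) {
			fprintf(stderr, "mock_alloc_mr() failed\n");
			*site = FAIL_MR_ALLOC;
			pool->size = i;	/* free only what was allocated */
			destroy_fr_pool(pool);
			return -ENOMEM;
		}
		pool->size = i + 1;
	}

	*out = pool;
	return 0;
}
```

Both failure paths return the same -ENOMEM to the caller, so in this sketch the only way to tell them apart from the outside is whether the per-entry message was printed. That is the inference being drawn from the missing ib_alloc_mr() print.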