On Mon, Mar 12, 2018 at 04:19:01PM -0500, Steve Wise wrote: > > On Mon, Mar 12, 2018 at 11:14:28AM -0600, Jason Gunthorpe wrote: > > > On Mon, Mar 12, 2018 at 12:03:55PM -0500, Steve Wise wrote: > > > > > > > > > > On Mon, Mar 12, 2018 at 06:05:31PM +0200, Leon Romanovsky wrote: > > > > > > On Mon, Mar 12, 2018 at 09:22:32AM -0600, Jason Gunthorpe wrote: > > > > > > > On Mon, Mar 12, 2018 at 04:16:16PM +0200, Leon Romanovsky wrote: > > > > > > > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx> > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > The reason to this patchset is described in "RDMA/mlx4: Clear > all > > > > > > > > allocated memory by default" accommodated with relevant crash > > > > report. > > > > > > > > > > > > > > Well, but you never explained what the actual problem is, and > I've > > > > failed > > > > > > > to guess... > > > > > > > > > > > > 109 void rdma_restrack_add(struct rdma_restrack_entry *res) > > > > > > 110 { > > > > > > 111 struct ib_device *dev = res_to_dev(res); > > > > > > 112 > > > > > > 113 if (!dev) > > > > > > 114 return; > > > > > > 115 > > > > > > 116 if (res_is_user(res)) { > > > > > > 117 if (!res->task) > > > > > > > > > > > > ^^^^^^^^^^^ this part wasn't zero in new > objects, > > > > > > sometimes it failed in QP (mlx4), sometimes it failed in CQ (mlx5) > and > > > > my > > > > > > grep showed that other drivers in worst situation. > > > > > > > > > > So this is only broken recently by 00313983 ? > > > > > > > > > > That is a pretty big change to require all users to provide zero'd > > > > > memory.. There are certainly simpler ways to fix this than changing > > > > > every driver.. > > > > > > > > > > What do you think Steve? > > > > > > > > > > > > > Perhaps an rdma_restrack_entry_init()? It would be called everywhere > the > > > > restrack_entry is embedded in an object struct. > > > > > > I was thinking simply a flag to rdma_restrack_add that lets the caller > > > specify the kernel_name and task are inited... > > > > It will look awful. > > > > Hiding the initialization of restrack->task via kzalloc() of the parent > struct is pretty awful too, I think. Yes, I agree. Very hhard to maintain. I don't like to trust drivers to do the right thing. They usually don't. Given all the pushback in the last few hours, this looks NAK'd to me.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html