On Thu, Mar 09, 2017 at 12:20:14PM +0800, Yi Zhang wrote:
> > > I'm using CX5-LX device and have not seen any issues with it.
> >
> > Would it be possible to retest with kmemleak?
> >
> Here is the device I used.
>
> Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
>
> The issue always can be reproduced with about 1000 time.
>
> Another thing is I found one strange phenomenon from the log:
>
> before the OOM occurred, most of the log are about "adding queue", and
> after the OOM occurred, most of the log are about "nvmet_rdma: freeing
> queue".
>
> seems the release work: "schedule_work(&queue->release_work);" not executed
> timely, not sure whether the OOM is caused by this reason.

Sagi,
The release work is queued on the global (system) workqueue. I'm not
familiar with the NVMe design and don't know all the details, but maybe
the proper way would be to create a dedicated workqueue with the
WQ_MEM_RECLAIM flag, so that the release work is guaranteed to make
forward progress under memory pressure?

>
> Here is the log before/after OOM
> http://pastebin.com/Zb6w4nEv
>
> > _______________________________________________
> > Linux-nvme mailing list
> > Linux-nvme@xxxxxxxxxxxxxxxxxxx
> > http://lists.infradead.org/mailman/listinfo/linux-nvme
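
To make the dedicated-workqueue suggestion above concrete, here is a rough,
untested sketch of the idea. It is not taken from the nvmet_rdma driver: the
structure, the function names and the workqueue name are all hypothetical
stand-ins. Only alloc_workqueue(), queue_work(), destroy_workqueue() and
WQ_MEM_RECLAIM are the real kernel workqueue API. Whether this actually
addresses the OOM reported in this thread is an open question.

```c
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

/* Hypothetical stand-in for the driver's queue structure. The real code
 * would call INIT_WORK(&queue->release_work, ...) when the queue is set up. */
struct example_queue {
	struct work_struct release_work;
};

/* Dedicated workqueue for release work (name is an assumption). */
static struct workqueue_struct *example_release_wq;

static void example_release_queue_work(struct work_struct *w)
{
	struct example_queue *queue =
		container_of(w, struct example_queue, release_work);

	/* The driver's real teardown would go here. */
	kfree(queue);
}

/* Would replace the existing schedule_work(&queue->release_work) call. */
static void __maybe_unused example_queue_release(struct example_queue *queue)
{
	queue_work(example_release_wq, &queue->release_work);
}

static int __init example_init(void)
{
	/* WQ_MEM_RECLAIM reserves a rescuer thread, so queued work has an
	 * execution context even when new workers cannot be allocated. */
	example_release_wq = alloc_workqueue("example_release_wq",
					     WQ_MEM_RECLAIM, 0);
	if (!example_release_wq)
		return -ENOMEM;
	return 0;
}

static void __exit example_exit(void)
{
	/* destroy_workqueue() drains pending work before freeing the wq. */
	destroy_workqueue(example_release_wq);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```

The point of the separate workqueue is only that release work no longer
competes for workers with everything else queued via schedule_work(), and
that the rescuer thread can run it under memory pressure.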