On Thu, Jul 11, 2019 at 09:32:17AM +0800, Lijun Ou wrote:
> From: Xi Wang <wangxi11@xxxxxxxxxx>
>
> When running perftest many times, the system reports a BUG as follows:
>
> [ 2312.559759] BUG: Bad rss-counter state mm:(____ptrval____) idx:0 val:-1
> [ 2312.574803] BUG: Bad rss-counter state mm:(____ptrval____) idx:1 val:1
>
> We tested with different kernel versions and found it started from the
> following commit:
>
> commit d10bcf947a3e ("RDMA/umem: Combine contiguous PAGE_SIZE regions in
> SGEs")
>
> In this commit, the sg->offset is always 0 when sg_set_page() is called in
> ib_umem_get() and the drivers are not allowed to change the sgl, otherwise
> it will get a bad page descriptor when unfolding SGEs in __ib_umem_release()
> as sg_page_count() will return the wrong result when sgl->offset is not 0.
>
> However, there is a weird sgl usage in the current hns driver: the driver
> modified sg->offset after calling ib_umem_get(), which caused us to iterate
> past the wrong number of pages in the for_each_sg_page iterator.
>
> This patch fixes it by correcting the non-standard sgl usage found in the
> hns_roce_db_map_user() function.
>
> Fixes: 0425e3e6e0c7 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
> Signed-off-by: Xi Wang <wangxi11@xxxxxxxxxx>
> ---
>  drivers/infiniband/hw/hns/hns_roce_db.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)

Applied to for-rc

Jason
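
For readers following the sg->offset point in the quoted commit message, here is a rough user-space sketch of why a non-zero offset changes the number of pages that gets walked. It is modeled on the page-count arithmetic of the kernel's sg_page_count() (offset plus length, rounded up to whole pages). The names sg_model and sg_model_page_count are hypothetical and exist only for this illustration, and a 4 KiB page size is assumed. It is not the hns driver code or the patch itself.

```c
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12u
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the two scatterlist fields that matter here. */
struct sg_model {
	unsigned int offset;	/* byte offset into the first page */
	unsigned int length;	/* length of the entry in bytes */
};

/*
 * Number of pages an entry appears to span: round (offset + length)
 * up to a page boundary, then convert bytes to pages.
 */
static unsigned int sg_model_page_count(const struct sg_model *sg)
{
	return (sg->offset + sg->length + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
}
```

Under these assumptions, an entry covering two pinned pages with offset 0 yields a count of 2. If the offset is later changed to 64 while the length stays the same, the count becomes 3, so a release path that walks the list would touch one page more than was pinned.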