On Sat 28-12-19 20:33:32, John Hubbard wrote:
> On 12/27/19 1:56 PM, John Hubbard wrote:
> ...
> >> It is ancient verification test (~10y) which is not an easy task to
> >> make it understandable and standalone :).
> >>
> >
> > Is this the only test that fails, btw? No other test failures or hints of
> > problems?
> >
> > (Also, maybe hopeless, but can *anyone* on the RDMA list provide some
> > characterization of the test, such as how many pins per page, what page
> > sizes are used? I'm still hoping to write a test to trigger something
> > close to this...)
> >
> > I do have a couple more ideas for test runs:
> >
> > 1. Reduce GUP_PIN_COUNTING_BIAS to 1. That would turn the whole override of
> > page->_refcount into a no-op, and so if all is well (it may not be!) with the
> > rest of the patch, then we'd expect this problem to not reappear.
> >
> > 2. Activate /proc/vmstat *foll_pin* statistics unconditionally (just for these
> > tests, of course), so we can see if there is a get/put mismatch. However, that
> > will change the timing, and so it must be attempted independently of (1), in
> > order to see if it ends up hiding the repro.
> >
> > I've updated this branch to implement (1), but not (2), hoping you can give
> > this one a spin?
> >
> > git@xxxxxxxxxx:johnhubbard/linux.git pin_user_pages_tracking_v11_with_diags
> >
> >
>
> Also, looking ahead:
>
> a) if the problem disappears with the latest above test, then we likely have
> a huge page refcount overflow, and there are a couple of different ways to
> fix it.
>
> b) if it still reproduces with the above, then it's some other random mistake,
> and in that case I'd be inclined to do a sort of guided (or classic, unguided)
> git bisect of the series. Because it could be any of several patches.
>
> If that's too much trouble, then I'd have to fall back to submitting a few
> patches at a time and working my way up to the tracking patch...
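
To make the accounting under discussion concrete, here is a rough userspace
sketch, not kernel code. It assumes GUP_PIN_COUNTING_BIAS is 1024, that a huge
page has 512 subpages (2 MiB on 4 KiB base pages), that every subpage pin adds
the bias to the head page's refcount, and that the refcount is a signed 32-bit
value. The names toy_page, toy_get(), toy_pin(), toy_unpin(),
pins_until_overflow() and mismatch_result() are made up for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed value of GUP_PIN_COUNTING_BIAS in the series under test. */
#define TOY_PIN_BIAS 1024

/* Toy stand-in for struct page; only the reference count is modelled. */
struct toy_page {
	int64_t refcount;	/* wide on purpose, so the model itself cannot wrap */
};

/* Ordinary reference: counts as 1. */
static inline void toy_get(struct toy_page *p) { p->refcount += 1; }

/* Pinned reference: counts as 'bias'. */
static inline void toy_pin(struct toy_page *p, int64_t bias) { p->refcount += bias; }
static inline void toy_unpin(struct toy_page *p, int64_t bias) { p->refcount -= bias; }

/*
 * Number of whole-huge-page pins at which a signed 32-bit refcount would
 * first exceed INT_MAX, if each of 'subpages' subpage pins adds 'bias' to
 * the head page. The baseline reference is ignored for simplicity.
 */
static int64_t pins_until_overflow(int64_t bias, int64_t subpages)
{
	assert(bias > 0 && subpages > 0);
	return (int64_t)INT_MAX / (bias * subpages) + 1;
}

/*
 * Refcount left behind when a reference taken with toy_get() is released
 * with toy_unpin(), starting from one baseline reference.
 */
static int64_t mismatch_result(int64_t bias)
{
	struct toy_page p = { .refcount = 1 };

	toy_get(&p);		/* refcount == 2 */
	toy_unpin(&p, bias);	/* refcount == 2 - bias */
	return p.refcount;
}
```

Under these assumptions, pins_until_overflow(TOY_PIN_BIAS, 512) gives 4096,
while a bias of 1 pushes that out to 4194304. Likewise mismatch_result() goes
negative (-1022) with a bias of 1024 but stays at 1 with a bias of 1, so a run
with the bias reduced to 1 would be expected to hide both kinds of problem.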

It could also be that an ordinary page reference is dropped with 'unpin'
thus underflowing the page refcount...

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR