Re: [PATCH v8 15/18] mm, fs, dax: handle layout changes to pinned dax mappings

Dan Williams <dan.j.williams@xxxxxxxxx> · Fri, 13 Apr 2018 15:03:51 -0700

On Mon, Apr 9, 2018 at 9:51 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> On Mon, Apr 9, 2018 at 9:49 AM, Jan Kara <jack@xxxxxxx> wrote:
>> On Sat 07-04-18 12:38:24, Dan Williams wrote:
> [..]
>>> I wonder if this can be trivially solved by using srcu. I.e. we don't
>>> need to wait for a global quiescent state, just a
>>> get_user_pages_fast() quiescent state. ...or is that an abuse of the
>>> srcu api?
>>
>> Well, I'd rather use the percpu rwsemaphore (linux/percpu-rwsem.h) than
>> SRCU. It is a more-or-less standard locking mechanism rather than relying
>> on implementation properties of SRCU which is a data structure protection
>> method. And the overhead of percpu rwsemaphore for your use case should be
>> about the same as that of SRCU.
>
> I was just about to ask that. Yes, it seems they would share similar
> properties and it would be better to use the explicit implementation
> rather than a side effect of srcu.

...unfortunately:

 BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:34
 [..]
 Call Trace:
  dump_stack+0x85/0xcb
  ___might_sleep+0x15b/0x240
  dax_layout_lock+0x18/0x80
  get_user_pages_fast+0xf8/0x140

...and thinking about it more, srcu is a better fit. We don't need the
100% exclusion provided by an rwsem; we only need the guarantee that
all cpus that might have been running get_user_pages_fast() have
finished it at least once.
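
To illustrate the difference from an rwsem, here is a toy,
single-threaded userspace model of that guarantee. It is only a sketch:
every name in it (toy_srcu, toy_read_lock(), and so on) is hypothetical,
it is not the kernel's SRCU implementation, and it is not safe for real
concurrent use. In the kernel the corresponding calls would be
srcu_read_lock() / srcu_read_unlock() on the reader side and
synchronize_srcu() on the updater side.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a grace period: NOT the kernel's SRCU code.
 * All names are hypothetical and the model is single-threaded.
 */
struct toy_srcu {
	int active;       /* epoch (0 or 1) that new readers join */
	long readers[2];  /* readers currently inside each epoch */
};

/* Reader entry: never blocks, only records the epoch it joined. */
static int toy_read_lock(struct toy_srcu *s)
{
	int idx = s->active;

	s->readers[idx]++;
	return idx;
}

/* Reader exit: drop the count on the epoch returned at entry. */
static void toy_read_unlock(struct toy_srcu *s, int idx)
{
	assert(s->readers[idx] > 0);  /* catch unbalanced unlocks */
	s->readers[idx]--;
}

/* Updater, step 1: steer new readers to the other epoch. */
static int toy_start_grace_period(struct toy_srcu *s)
{
	int old = s->active;

	s->active = old ^ 1;
	return old;
}

/* Updater, step 2: true once every pre-existing reader has left. */
static bool toy_grace_period_done(const struct toy_srcu *s, int old)
{
	return s->readers[old] == 0;
}
```

In this model a reader that arrives after toy_start_grace_period() joins
the new epoch and does not delay the updater; only readers that were
already inside have to finish. An rwsem would instead keep new readers
out until the writer releases it.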

In my tests synchronize_srcu() is a bit slower than unpatched for the
trivial 100 truncate test, but certainly not the 200x latency you were
seeing with synchronize_rcu().

Elapsed time:
0.006149178 unpatched
0.009426360 srcu