On Mon, Sep 10, 2018 at 4:44 PM, Alexander Duyck <alexander.duyck@xxxxxxxxx> wrote:
> From: Alexander Duyck <alexander.h.duyck@xxxxxxxxx>
>
> This patch is based off of the pci_call_probe function used to initialize
> PCI devices. The general idea here is to move the probe call to a location
> that is local to the memory being initialized. By doing this we can shave
> significant time off of the total time needed for initialization.
>
> With this patch applied I see a significant reduction in overall init time
> as without it the init varied between 23 and 37 seconds to initialize a 3GB
> node. With this patch applied the variance is only between 23 and 26
> seconds to initialize each node.
>
> I hope to refine this further in the future by combining this logic into
> the async_schedule_domain code that is already in use. By doing that it
> would likely make this functionality redundant.

Yeah, it is a bit sad that we schedule an async thread only to move it
back somewhere else. Could we trivially achieve the same with an
async_schedule_domain_on_cpu() variant? It seems we can, and the
workqueue core will "Do the right thing".

I now notice that async uses the system_unbound_wq and work_on_cpu()
uses the system_wq. I don't think we want long-running nvdimm work on
system_wq.
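For illustration only, a variant along those lines might look roughly like the sketch below. To be clear, async_schedule_domain_on_cpu() does not exist in the tree; the sketch assumes the existing async machinery (struct async_entry, queue_work_on(), system_unbound_wq) and elides the allocation/cookie bookkeeping that __async_schedule() actually does:

```c
/*
 * HYPOTHETICAL sketch, not an existing kernel API. The idea is to
 * reuse the async infrastructure as-is, but queue the entry's work on
 * a specific CPU so the probe runs local to the memory being
 * initialized. Internals of the async core are assumed, not copied.
 */
static async_cookie_t async_schedule_domain_on_cpu(async_func_t func,
						   void *data, int cpu,
						   struct async_domain *domain)
{
	/*
	 * Where the real async core does, for the entry it allocates:
	 *
	 *	queue_work(system_unbound_wq, &entry->work);
	 *
	 * a CPU-affine variant would instead do:
	 *
	 *	queue_work_on(cpu, system_unbound_wq, &entry->work);
	 *
	 * On an unbound workqueue the cpu argument is a placement
	 * preference rather than a hard binding, and critically the
	 * work stays off system_wq, which matters for long-running
	 * nvdimm probe work.
	 */
	...
}
```

The point being that the caller keeps the async cookie/domain semantics (async_synchronize_full_domain() still works) while getting the node-local placement the patch open-codes today.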