On Tue, Nov 15, 2022 at 12:54:39PM -0800, Ira Weiny wrote:
> On Tue, Nov 15, 2022 at 02:41:35PM -0600, Bjorn Helgaas wrote:
> > On Tue, Nov 15, 2022 at 12:18:38PM -0800, Ira Weiny wrote:
> > > On Tue, Nov 15, 2022 at 01:44:24PM -0600, Bjorn Helgaas wrote:
> > > > On Mon, Nov 14, 2022 at 05:19:43PM -0800, ira.weiny@xxxxxxxxx wrote:
> > > > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> > > > >
> > > > > The callers of pci_doe_submit_task() allocate the
> > > > > pci_doe_task on the stack. This causes the work structure
> > > > > to be allocated on the stack without pci_doe_submit_task()
> > > > > knowing. Work item initialization needs to be done with
> > > > > either INIT_WORK_ONSTACK() or INIT_WORK() depending on how
> > > > > the work item is allocated.
> > > > >
> > > > > Jonathan suggested creating doe task allocation macros such
> > > > > as DECLARE_CDAT_DOE_TASK_ONSTACK().[1] The issue with this
> > > > > is the work function is not known to the callers and must be
> > > > > initialized correctly.
> > > > >
> > > > > A follow up suggestion was to have an internal
> > > > > 'pci_doe_work' item allocated by pci_doe_submit_task().[2]
> > > > > This requires an allocation which could restrict the context
> > > > > where tasks are used.
> > > > >
> > > > > Compromise with an intermediate step to initialize the task
> > > > > struct with a new call pci_doe_init_task() which must be
> > > > > called prior to submit task.
> > > >
> > > > I'm not really a fan of passing a parameter to say "this struct is on
> > > > the stack" because that seems kind of error-prone and I don't know
> > > > what the consequence of getting it wrong would be. Sounds like it
> > > > *could* be some memory corruption or reading garbage data that would
> > > > be hard to debug.
> > > >
> > > > Do we have cases today where pci_doe_submit_task() can't do the
> > > > kzalloc() as in your patch at [3]?
>
> No.
>
> > > > If the current use cases allow a
> > > > kzalloc(), why not do that now and defer this until it becomes an
> > > > issue?
>
> I do like pci_doe_submit_task() handling this as an internal detail.
> I'm happy with that if you are.
>
> I was just concerned about the restriction of context. Dan
> suggested this instead of passing a gfp parameter.
>
> If you are happy with my original patch I will submit it instead.
> (With a better one liner.)

I don't know what's coming as far as pci_doe_submit_task() callers.
If there's some imminent caller that will require atomic context, I
guess we could solve it now. But DOE doesn't really seem like an
atomic context thing to begin with, so maybe we could postpone dealing
with it.

That patch in [3] is more complicated than I expected, but I admit I
haven't looked closely.

Bjorn

> > > > > [1] https://lore.kernel.org/linux-cxl/20221014151045.24781-1-Jonathan.Cameron@xxxxxxxxxx/T/#m88a7f50dcce52f30c8bf5c3dcc06fa9843b54a2d
> > > > > [2] https://lore.kernel.org/linux-cxl/20221014151045.24781-1-Jonathan.Cameron@xxxxxxxxxx/T/#m63c636c5135f304480370924f4d03c00357be667
> > > >
> > > > [3] https://lore.kernel.org/linux-cxl/Y2AnKB88ALYm9c5L@iweiny-desk3/