On 2017-07-07 09:20:02 [-0400], Chad Dupuis wrote:
> What was the question? My observation is that the patch I proposed fixed
> the issue we saw on testing the patch set. With that small change
> (essentially modulo by the number of active CPUs vs. the total number)
> your patch set worked ok.

The mail at the bottom of this one is where I said why I think your patch
is a nop in this context.

Sebastian

On 2017-05-17 17:07:34 [+0200], To Chad Dupuis wrote:
> > > Sebastian, can you add this change to your patch set?
> >
> > Are you sure that you can reliably reproduce the issue and fix it with
> > the patch above? Because this patch:
>
> Oh. Okay. Now it clicked. It can fix the issue, but it is still possible
> that CPU0 goes down between your check for it and schedule_work_on()
> returning. Let me think of something…

Oh wait. I already thought about this: it may take bnx2fc_percpu from
CPU7 and run the worker on CPU3. The job isn't lost, because the worker
does:

| static void bnx2fc_percpu_io_work(struct work_struct *work_s)
| {
|         struct bnx2fc_percpu_s *p;
…
|         p = container_of(work_s, struct bnx2fc_percpu_s, work);
|
|         spin_lock_bh(&p->fp_work_lock);

and so will access the bnx2fc_percpu of CPU7 while running on CPU3. So I
*think* your patch should make no difference, and there should be no leak
if schedule_work_on() is invoked on an offline CPU.