Re: [PATCH 08/35] autonuma: introduce kthread_bind_node()

On Tue, 2012-05-29 at 18:11 +0200, Andrea Arcangeli wrote:
> On Tue, May 29, 2012 at 02:49:13PM +0200, Peter Zijlstra wrote:
> > On Fri, 2012-05-25 at 19:02 +0200, Andrea Arcangeli wrote:
> > >  /**
> > > + * kthread_bind_node - bind a just-created kthread to the CPUs of a node.
> > > + * @p: thread created by kthread_create().
> > > + * @nid: node (might not be online, must be possible) for @p to run on.
> > > + *
> > > + * Description: This function is equivalent to set_cpus_allowed(),
> > > + * except that @nid doesn't need to be online, and the thread must be
> > > + * stopped (i.e., just returned from kthread_create()).
> > > + */
> > > +void kthread_bind_node(struct task_struct *p, int nid)
> > > +{
> > > +       /* Must have done schedule() in kthread() before we set_task_cpu */
> > > +       if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE)) {
> > > +               WARN_ON(1);
> > > +               return;
> > > +       }
> > > +
> > > +       /* It's safe because the task is inactive. */
> > > +       do_set_cpus_allowed(p, cpumask_of_node(nid));
> > > +       p->flags |= PF_THREAD_BOUND;
> > 
> > No, I've said before, this is wrong. You should only ever use
> > PF_THREAD_BOUND when it's strictly per-cpu. Moving your numa threads
> > to a different node is silly but not fatal in any way.
> 
> I changed the semantics of that bitflag, now it means: userland isn't
> allowed to shoot itself in the foot and mess with whatever CPU
> bindings the kernel has set for the kernel thread.

Yeah, and you did so without mentioning that in your changelog.
Furthermore I object to that change. I object even more strongly to
doing it without mention and keeping a misleading comment near the
definition.

> It'd be a clear regression not to set PF_THREAD_BOUND there. It would be
> even worse to remove the CPU binding to the node: it'd risk copying
> memory with both src and dst being on nodes remote from the CPU that
> knuma_migrate runs on (there aren't just 2 node systems out there).

Just teach each knuma_migrated what node it represents and don't use
numa_node_id().

That way you can change the affinity just fine, it'll be sub-optimal,
copying memory from node x to node y through node z, but it'll still
work correctly.

numa isn't special in the way per-cpu stuff is special.
