Hi Srivatsa, Daniel,

Thank you very much for all the comments. :)

On 09/05/2012 04:57 AM, Srivatsa S. Bhat wrote:
> I had posted a Linux kernel patchset[1] some time ago to expose another
> file so that we can distinguish between the user specified settings vs
> the actual scenario underneath. But the conclusion in the ensuing
> discussion was that the existing kernel behaviour is good as is, and
> trying to "fix" it would break kernel semantics. (However, note that the
> suspend/resume case has been fixed in the kernel by commit d35be8bab).
>
> [1]. http://thread.gmane.org/gmane.linux.documentation/4805

The reason I made this patch set is that if libvirt doesn't restore cpuset.cpus, every domain whose XML pins vcpus to a *re-plugged* cpu will fail to start. That means all these domains become unusable unless we modify their configuration.

If the cpu is really removed, it is normal for a domain to fail to start; we can simply print an error message. But if the cpu is added again, and it is active and usable, the domain should be able to start normally. (Am I right here?) This is the key problem I want to solve.

So first, I improved the netlink-related code in libvirt, and now libvirt can be notified when a cpu hotplug event occurs.

I read the emails posted above. In summary, you discussed the following problems:

1) Make cgroup able to distinguish the actual configuration from the user's.
   - (Srivatsa's idea: mask = (actual config) & (user config))
     This seems hard to apply for some cgroup design reasons.

2) Kill all the tasks on the cpu when hot-unplugging it.
   - I don't think this is a good idea, and it won't solve the problem.
     For example, take a task bound to cpu 3, and suppose cpu 3 is unplugged:
     * if the task is killed, that's just too rude, and users running important tasks will suffer.
     * if the task is migrated to other cpus, what if cpu 3 becomes active again? Are we going to treat the re-added cpu 3 as a different cpu from the original cpu 3?
     Either way, the domain will still fail to start.

3) Make cpu hot-unplug fail when there are tasks on it.
   - This may be unacceptable for hotplug users, and it won't solve the problem either. If the domain is not running when the hot-unplug happens, the hot-unplug will succeed. And when we start the domain, it will fail anyway, right?

4) Make libvirt not use the cpuset cgroup.
   - For now, this seems impossible. sched_setaffinity() behaves properly: it assumes the re-plugged cpu is the same one that was unplugged before. (Am I right?) But with cgroup's control, we cannot resolve this problem using sched_setaffinity().

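To make the mask idea in 1) concrete, here is a rough sketch in Python (chosen only for brevity). It is not libvirt or kernel code: the helper names parse_cpulist, format_cpulist and effective_cpus are hypothetical, and the only assumption about the system is that cpu sets are written in the usual "0-3,5" list style used by files such as cpuset.cpus.

```python
def parse_cpulist(text):
    """Parse a cpu list such as "0-3,5" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpulist(cpus):
    """Format a set of cpu ids back into the "0-3,5" style."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current run
        else:
            ranges.append([cpu, cpu])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def effective_cpus(user_config, online):
    """mask = (actual config) & (user config), both given as cpu lists."""
    return parse_cpulist(user_config) & parse_cpulist(online)


# Hypothetical scenario: the user pinned a vcpu to cpus 2-3.
user = "2-3"
print(format_cpulist(effective_cpus(user, "0-2,4-7")))  # cpu 3 unplugged
print(format_cpulist(effective_cpus(user, "0-7")))      # cpu 3 plugged back
```

The point of keeping the user's list separate is that once cpu 3 is back, the effective set returns to 2-3 without anyone editing the configuration, which is the behaviour I would like the domain start path to see.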
If I want to solve the start failure problem, what should I do?

Thanks. :)