> -----Original Message-----
> From: Dave Hansen [mailto:dave@xxxxxxxx]
> Sent: Wednesday, July 24, 2013 12:43 PM
> To: KY Srinivasan
> Cc: Dave Hansen; Michal Hocko; gregkh@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx;
> apw@xxxxxxxxxxxxx; andi@xxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> linux-mm@xxxxxxxxx; kamezawa.hiroyuki@xxxxxxxxx; hannes@xxxxxxxxxxx;
> yinghan@xxxxxxxxxx; jasowang@xxxxxxxxxx; kay@xxxxxxxx
> Subject: Re: [PATCH 1/1] Drivers: base: memory: Export symbols for onlining
> memory blocks
>
> On 07/23/2013 10:21 AM, KY Srinivasan wrote:
> >> You have allocated some large, physically contiguous areas of memory
> >> under heavy pressure. But you also contend that there is too much
> >> memory pressure to run a small userspace helper. Under heavy memory
> >> pressure, I'd expect large, kernel allocations to fail much more often
> >> than running a small userspace helper.
> >
> > I am only reporting what I am seeing. Broadly, I have two main failure
> > conditions to deal with: (a) resource-related failure (add_memory()
> > returning -ENOMEM) and (b) not being able to online a segment that has
> > been successfully hot-added. I have seen both these failures under high
> > memory pressure. By supporting "in context" onlining, we can eliminate
> > one failure case. Our inability to online is not a recoverable failure
> > from the host's point of view - the memory is committed to the guest
> > (since hot add succeeded) but is not usable since it is not onlined.
>
> Could you please precisely report on what you are seeing in detail?
> Where are the -ENOMEMs coming from? Which allocation site? Are you
> seeing OOMs or page allocation failure messages on the console?

The -ENOMEM failure I see is from the call to hot-add memory - the call to
add_memory(). Usually I don't see any OOM messages on the console.

> The operation was split up into two parts for good reason.
> It's actually for your _precise_ use case.

I agree; without this split, I could not have implemented the balloon
driver with hot-add.

> A system under memory pressure is going to have troubles doing a
> hot-add. You need memory to add memory. Of the two operations ("add"
> and "online"), "add" is the one vastly more likely to fail. It has to
> allocate several large swaths of contiguous physical memory. For that
> reason, the system was designed so that you could "add" and "online"
> separately. The intention was that you could "add" far in advance and
> then "online" under memory pressure, with the "online" having *VASTLY*
> smaller memory requirements and being much more likely to succeed.
>
> You're lumping the "allocate several large swaths of contiguous physical
> memory" failures in to the same class as "run a small userspace helper".
> They are _really_ different problems. Both prone to allocation
> failures for sure, but _very_ separate problems. Please don't conflate
> them.

I don't think I am conflating these two issues; I am sorry if I gave that
impression. All I am saying is that I see two classes of failures:
(a) our inability to allocate memory to manage the memory that is being
hot-added, and (b) our inability to bring the hot-added memory online
within a reasonable amount of time. I am not sure of the cause of (b);
I was just speculating that it could be memory related. What is
interesting is that I have seen failures to online the memory even after
the hot add itself succeeded.

> >> It _sounds_ like you really want to be able to have the host retry the
> >> operation if it fails, and you return success/failure from inside the
> >> kernel. It's hard for you to tell if running the userspace helper
> >> failed, so your solution is to move what was previously done in
> >> userspace in to the kernel so that you can more easily tell if it
> >> failed or succeeded.
> >>
> >> Is that right?
> >
> > No; I am able to get the proper error code for recoverable failures
> > (hot-add failures because of lack of memory). By doing what I am
> > proposing here, we can avoid one class of failures completely, and I
> > think this is what resulted in a better "hot add" experience in the
> > guest.
>
> I think you're taking a huge leap here: "We could not online memory,
> thus we must take userspace out of the loop."
>
> You might be right. There might be only one way out of this situation.
> But you need to provide a little more supporting evidence before we all
> arrive at the same conclusion.

I am not even suggesting that. All I am saying is that there should be a
mechanism for "in context" onlining of memory - that is, onlining from a
kernel context - in addition to the existing sysfs mechanism. The Hyper-V
balloon driver can certainly use this functionality. I should be sending
out the patches for this shortly.

> BTW, it doesn't _require_ udev. There could easily be another listener
> for hotplug events.

Agreed; but structurally it is identical to having a udev rule.

Regards,

K. Y

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@xxxxxxxxx. For more info on Linux MM, see:
http://www.linux-mm.org/ .
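For readers following the thread: the userspace onlining path under
discussion boils down to writing the string "online" to a memory block's
sysfs state file. The sketch below performs that write against a mock
directory so it is runnable anywhere; on a real system the path is
/sys/devices/system/memory/memoryN/state (root required), and "memory32"
is an illustrative block number, not one from this thread.

```shell
# Mock the sysfs memory-block layout (the real tree lives under
# /sys/devices/system/memory and is created by the kernel on hot-add).
mock=$(mktemp -d)
mkdir -p "$mock/memory32"
echo offline > "$mock/memory32/state"

# The "online" step a udev rule or admin performs after a successful
# add_memory(): a single write to the block's state file.
echo online > "$mock/memory32/state"

cat "$mock/memory32/state"   # -> online
```

On real hardware a udev rule matching memory-subsystem "add" events can
issue this same write automatically; that is the userspace-helper approach
the thread weighs against onlining directly from kernel context.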