On Oct 6, 2022, at 12:34 AM, Alexander Atanasov <alexander.atanasov@xxxxxxxxxxxxx> wrote:

> Hello,
>
> On 5.10.22 20:25, Nadav Amit wrote:
>> On Oct 5, 2022, at 2:01 AM, Alexander Atanasov <alexander.atanasov@xxxxxxxxxxxxx> wrote:
>>> Add counters to be updated by the balloon drivers.
>>> Create balloon notifier to propagate changes.
>>
>> I missed the other patches before (including this one). Sorry, but next
>> time, please cc me.
>
> You are CCed in the cover letter since the first version. I will add CC
> to you in the individual patches if you want.

Thanks. Just to clarify - I am not attacking you. It’s more of me making
an excuse for not addressing some issues in earlier versions.

>> I was looking through the series and I did not see actual users of the
>> notifier. Usually, it is not great to build an API without users.
>
> You are right. I hope to get some feedback/interest from the potential
> users that I mentioned in the cover letter. I will probably split the
> notifier into a separate series. To make it useful it will require more
> changes; see below for more about them.

Fair, but this is something that is more suitable for an RFC. Otherwise,
more likely than not, your patches would go in as is.

>> [snip]
>>> +
>>> +static int balloon_notify(unsigned long val)
>>> +{
>>> +	return srcu_notifier_call_chain(&balloon_chain, val, NULL);
>>
>> Since you know the inflated_kb value here, why not use it as an argument
>> to the callback? I think casting to (void *) and back is best. But you
>> can also provide a pointer to the value. Doesn’t it sound better than
>> having potentially different notifiers reading different values?
>
> My current idea is to have a struct with the current and previous value,
> maybe the change in percent. The actual value does not matter to anyone,
> but the size of the change does. When a user gets notified it can act
> upon the change - if it is small it can ignore it, if it is above some
> threshold it can act - and if it makes sense for some receiver, it can
> accumulate changes from several notifications. Another option/addition
> is to have si_meminfo_current(..) and totalram_pages_current(..) that
> return values adjusted with the balloon values.
>
> Going further - there are a few places that calculate something based on
> available memory that do not have a sysfs/proc interface for setting
> limits. Most of them work in percent, so they can be converted to redo
> their calculations when they get a notification.
>
> The ones that have an interface for configuration but use memory values
> can be handled in two ways - convert them to use percent of what is
> available, or extend the notifier to notify userspace, which in turn
> does the calculations and updates the configuration.

I really need to see code to fully understand what you have in mind.
Division, as you know, is not something that we really want to do very
frequently.
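To make my earlier point concrete, here is roughly what I had in mind for
passing the value through the chain - an untested sketch; my_balloon_cb
and its body are made up for illustration:

static int balloon_notify(unsigned long val, unsigned long inflated_kb)
{
	/* Every notifier sees the same snapshot of inflated_kb. */
	return srcu_notifier_call_chain(&balloon_chain, val,
					(void *)inflated_kb);
}

/* In a (made up) receiver, cast back from the data pointer: */
static int my_balloon_cb(struct notifier_block *nb, unsigned long val,
			 void *data)
{
	unsigned long inflated_kb = (unsigned long)data;

	pr_debug("balloon now at %lu KiB\n", inflated_kb);
	return NOTIFY_OK;
}

If you end up with the old/new-value struct you describe, passing a
pointer to it as the data argument works the same way; either way no
receiver has to reread a shared counter that might have changed.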
>> Anyhow, without users (actual notifiers) it’s kind of hard to know how
>> reasonable it all is. For instance, is balloon_notify() supposed to
>> prevent further balloon inflating/deflating until the notifier
>> completes?
>
> No, we must avoid that at any cost.
>
>> Accordingly, are callers to balloon_notify() expected to relinquish
>> locks before calling balloon_notify() to prevent deadlocks and high
>> latency?
>
> My goal is to avoid any possible impact on performance. Drivers are free
> to delay notifications if they get in the way. (I see that I need to
> move the notification after the semaphore in the vmw driver - I missed
> that - will fix in the next iteration.)
>
> Deadlocks - that depends on the users, but few to none will likely have
> to deal with common locks.

I will need to see the next version to give better feedback.

One more thing that comes to mind, though, is whether saving the balloon
size in multiple places (both mem_balloon_inflated_total_kb and each
balloon’s own accounting) is the right way. It does not sound very clean.
Two other options are to move *all* the accounting to your new
mem_balloon_inflated_total_kb-like interface, or to expose some
per-balloon function to get the balloon size (an indirect function call
would likely have some overhead though).

Anyhow, I am not crazy about having the same data replicated. Even from
a reading-the-code standpoint, it is not intuitive.
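To illustrate the second option - a rough, hypothetical sketch, none of
these names exist today:

/* Each balloon driver would register its own size callback. */
struct balloon_size_ops {
	unsigned long (*get_inflated_kb)(void *data);	/* size in KiB */
	void *data;
};

static struct balloon_size_ops *balloon_ops;	/* one balloon at a time */

unsigned long mem_balloon_inflated_total_kb(void)
{
	struct balloon_size_ops *ops = READ_ONCE(balloon_ops);

	/* The indirect call here is the overhead I mentioned above. */
	return ops ? ops->get_inflated_kb(ops->data) : 0;
}

That keeps a single source of truth - the driver’s own accounting - at
the cost of one indirect call per read, which may be acceptable for a
value that is read rarely.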