On Thu, 27 Jun 2013 11:26:16 +0200 Michal Hocko <mhocko@xxxxxxx> wrote: > On Wed 26-06-13 23:17:12, Luiz Capitulino wrote: > > Currently, an eventfd is notified for the level it's registered for > > _plus_ higher levels. > > > > This is a problem if an application wants to implement different > > actions for different levels. For example, an application might want > > to release 10% of its cache on level low, 50% on medium and 100% on > > critical. To do this, an application has to register a different > > eventfd for each pressure level. However, fd low is always going to > > be notified and and all fds are going to be notified on level critical. > > > > Strict mode solves this problem by strictly notifiying an eventfd > > for the pressure level it registered for. This new mode is optional, > > by default we still notify eventfds on higher levels too. > > > > Signed-off-by: Luiz Capitulino <lcapitulino@xxxxxxxxxx> > > Two nits bellow but it looks good in general to me. You can add my > Reviewed-by: Michal Hocko <mhocko@xxxxxxx> Thanks. > I still think that edge triggering makes some sense but that one might > be rebased on top of this patch. We should still figure out whether the > edge triggering is the right approach for the use case Hyunhee Kim wants > it for so the strict mode should go first IMO. I also think that edge triggering makes sense. > > --- > > > > o v2 > > > > - Improve documentation > > - Use a bit to store mode instead of a bool > > - Minor changelog changes > > > > Documentation/cgroups/memory.txt | 26 ++++++++++++++++++++++---- > > mm/vmpressure.c | 26 ++++++++++++++++++++++++-- > > 2 files changed, 46 insertions(+), 6 deletions(-) > > > > diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt > > index ddf4f93..412872b 100644 > > --- a/Documentation/cgroups/memory.txt > > +++ b/Documentation/cgroups/memory.txt > > @@ -791,6 +791,22 @@ way to trigger. Applications should do whatever they can to help the > > system. It might be too late to consult with vmstat or any other > > statistics, so it's advisable to take an immediate action. > > > > +Applications can also choose between two notification modes when > > +registering an eventfd for memory pressure events: > > + > > +When in "non-strict" mode, an eventfd is notified for the specific level > > +it's registered for and higher levels. For example, an eventfd registered > > +for low level is also going to be notified on medium and critical levels. > > +This mode makes sense for applications interested on monitoring reclaim > > +activity or implementing simple load-balacing logic. The non-strict mode > > +is the default notification mode. > > + > > +When in "strict" mode, an eventfd is strictly notified for the pressure > > +level it's registered for. For example, an eventfd registered for the low > > +level event is not going to be notified when memory pressure gets into > > +medium or critical levels. This allows for more complex logic based on > > +the actual pressure level the system is experiencing. > > It would be also fair to mention that there is no guarantee that lower > levels are signaled before higher so nobody should rely on seeing LOW > before MEDIUM or CRITICAL. I think this is implied. Actually, as an user of this interface I didn't expect this to happen until I read the code. > > > + > > The events are propagated upward until the event is handled, i.e. the > > events are not pass-through. Here is what this means: for example you have > > three cgroups: A->B->C. Now you set up an event listener on cgroups A, B > > @@ -807,12 +823,14 @@ register a notification, an application must: > > > > - create an eventfd using eventfd(2); > > - open memory.pressure_level; > > -- write string like "<event_fd> <fd of memory.pressure_level> <level>" > > +- write string like "<event_fd> <fd of memory.pressure_level> <level> [strict]" > > to cgroup.event_control. > > > > -Application will be notified through eventfd when memory pressure is at > > -the specific level (or higher). Read/write operations to > > -memory.pressure_level are no implemented. > > +Applications will be notified through eventfd when memory pressure is at > > +the specific level or higher. If strict is passed, then applications > > +will only be notified when memory pressure reaches the specified level. > > + > > +Read/write operations to memory.pressure_level are no implemented. > > > > Test: > > > > diff --git a/mm/vmpressure.c b/mm/vmpressure.c > > index 736a601..ba5c17e 100644 > > --- a/mm/vmpressure.c > > +++ b/mm/vmpressure.c > > @@ -138,8 +138,16 @@ struct vmpressure_event { > > struct eventfd_ctx *efd; > > enum vmpressure_levels level; > > struct list_head node; > > + unsigned int mode; > > You would fill up a hole between level and node if you move it up on > 64b. Doesn't matter much but why not do it... You want me to respin? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>